We find limiting distributions of the nonparametric maximum likelihood estimator (MLE) of a log-concave density, that is, a density of the form $f_0 = \exp \varphi_0$, where $\varphi_0$ is a concave function on $\mathbb{R}$. The pointwise limiting distributions depend on the second and third derivatives at 0 of $H_k$, the "lower invelope" of an integrated Brownian motion process minus a drift term depending on the number of vanishing derivatives of $\varphi_0 = \log f_0$ at the point of interest. We also establish the limiting distribution of the resulting estimator of the mode $M(f_0)$ and establish a new local asymptotic minimax lower bound which shows the optimality of our mode estimator in terms of both rate of convergence and dependence of constants on population values.

1. Introduction.

1.1. Log-concave densities. A probability density $f$ on the real line is called log-concave if it can be written as

$$f(x) = \exp \varphi(x)$$

for some concave function $\varphi : \mathbb{R} \to [-\infty, \infty)$. We let $\mathcal{LC}$ denote the class of all log-concave densities on $\mathbb{R}$. As shown by Ibragimov (1956), a density function $f$ is log-concave if and only if its convolution with any unimodal density is again unimodal. Thus, the class of log-concave densities is often referred to as the class of "strongly unimodal" densities. Furthermore, the class $\mathcal{LC}$ of log-concave densities is exactly the class of Pólya frequency functions of order 2, $PFF_2$, as noted by Pal, Woodroofe and Meyer (2007); see also Dharmadhikari and Joag-Dev (1988), page 150, and Marshall and Olkin (1979), page 492.

The log-concave shape constraint is appealing for many reasons: 

(1) Many parametric models, for a certain range of their parameters, are in fact log-concave, for example, normal, uniform, gamma($r, \lambda$) for $r \ge 1$, beta($a, b$) for $a \ge 1$ and $b \ge 1$, generalized Pareto, Gumbel, Fréchet, logistic or Laplace, to mention only some of these models. Therefore, assuming log-concavity offers a flexible nonparametric alternative to purely parametric models. Note that a log-concave density need not be symmetric.

(2) Every log-concave density is automatically unimodal. Furthermore, log-concavity of a density $f$ immediately implies specific shape constraints for certain functions derived from $f$ [see Barlow and Proschan (1975), Marshall and Olkin (1979, 2007), Dharmadhikari and Joag-Dev (1988), An (1998) and Bagnoli and Bergstrom (2005)]. Thus, having an estimator (and its limiting distribution) for $f$ at hand provides, almost automatically, estimators (and limiting distributions) for those functions. Corollary 2.3 illustrates this for the hazard rate.

(3) Although the nonparametric MLE of a unimodal density does not exist [see, e.g., Birgé (1997)], the nonparametric MLE of a log-concave density exists, is unique and has desirable consistency and rates of convergence properties. Thus, the class of log-concave (or strongly unimodal) densities may be a useful and valuable surrogate for the larger class $\mathcal{U}$ of unimodal densities.

(4) Tests for multimodality and mixing can be based on a semiparametric model with densities of the form $f_{c,\varphi}(x) = \exp(\varphi(x) + c x^2)$, where $\varphi$ is concave and $c > 0$, as shown by Walther (2002).

(5) Chang and Walther (2007) further show that the EM-algorithm can 
be extended to work for log-concave component densities. 

(6) First attempts to estimate a log-concave density in $\mathbb{R}^d$ were made by Cule, Gramacy and Samworth (2007).

(7) The log-concave density estimator can be used to improve accuracy in the estimation of the so-called "tail index" of a generalized Pareto distribution [see Müller and Rufibach (2009)].

(8) It should be noted that no arbitrary choices such as bandwidth, kernel 
or prior are involved in the estimation of a log-concave density; these are all 
obviated by this shape restriction. 

(9) We expect good adaptivity properties of the MLE $\hat f_n$ in the class $\mathcal{LC}$.

For properties of (random variables with) log-concave densities, we refer to Dharmadhikari and Joag-Dev (1988), Marshall and Olkin (1979) and Rufibach (2006). Log-concavity of a density $f$ implies certain shape constraints for functions derived from $f$, such as the distribution function, the tail or hazard function. See An (1998) for comparisons with the related notion of a log-convex density.
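As a small numerical illustration of the definition, the following sketch checks log-concavity on a grid by testing that the second differences of the log-density are nonpositive. It is only a grid check under stated assumptions, not a proof: the helper name `is_log_concave_on_grid`, the grid, and the tolerance are hypothetical choices, and the Cauchy density is included here purely as an illustrative counterexample (it is unimodal but not log-concave).

```python
import math

def is_log_concave_on_grid(log_density, lo, hi, n=2001, tol=1e-9):
    """Return True if all second differences of log_density on a uniform
    grid over [lo, hi] are <= tol (a necessary grid-level check only)."""
    h = (hi - lo) / (n - 1)
    vals = [log_density(lo + i * h) for i in range(n)]
    return all(vals[i - 1] - 2.0 * vals[i] + vals[i + 1] <= tol
               for i in range(1, n - 1))

def log_normal_pdf(x):
    # log of the standard normal density
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def log_laplace_pdf(x):
    # log of the standard Laplace density (concave, with a kink at 0)
    return -abs(x) - math.log(2.0)

def log_cauchy_pdf(x):
    # log of the standard Cauchy density (convex for |x| > 1)
    return -math.log(math.pi * (1.0 + x * x))

results = {
    "normal": is_log_concave_on_grid(log_normal_pdf, -5.0, 5.0),
    "laplace": is_log_concave_on_grid(log_laplace_pdf, -5.0, 5.0),
    "cauchy": is_log_concave_on_grid(log_cauchy_pdf, -5.0, 5.0),
}
print(results)
```

The normal and Laplace densities pass the check, in line with item (1) above, while the Cauchy density fails it in its tails.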

1.2. Log-concave density estimation. Now let $X_{(1)} < X_{(2)} < \cdots < X_{(n)}$ be the order statistics of $n$ independent random variables $X_1, \ldots, X_n$, distributed according to a log-concave probability density $f_0 = \exp \varphi_0$ on $\mathbb{R}$. The distribution function corresponding to $f_0$ is denoted by $F_0$.

The maximum likelihood estimator (MLE) of a log-concave density was introduced in Rufibach (2006) and Dümbgen and Rufibach (2009). Algorithmic aspects were treated in Rufibach (2007) and in a more general framework in Dümbgen, Hüsler and Rufibach (2007), while consistency with respect to the Hellinger metric was established by Pal, Woodroofe and Meyer (2007), and rates of convergence of $\hat f_n$ and $\hat F_n$ were established by Dümbgen and Rufibach (2009). Since the derivation of the MLE of a log-concave density is extensively treated in these references, we only briefly recall its definition and the properties relevant for this paper.

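If $\mathcal{C}$ denotes the class of all concave functions $\varphi : \mathbb{R} \to [-\infty, \infty)$, the estimator $\hat\varphi_n$ of $\varphi_0$ is the maximizer of the "adjusted" criterion function

$$\Psi_n(\varphi) = \int_{\mathbb{R}} \varphi(x)\, d\mathbb{F}_n(x) - \int_{\mathbb{R}} \exp \varphi(x)\, dx$$

over $\mathcal{C}$, where $\mathbb{F}_n$ is the empirical distribution function of the observations. The log-concave density estimator is then $\hat f_n := \exp \hat\varphi_n$, which exists and is unique.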
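To make the "adjusted" criterion concrete, here is a minimal sketch that evaluates $\int \varphi\, d\mathbb{F}_n - \int \exp \varphi\, dx$ for a candidate $\varphi$ that is piecewise linear between the sorted observations and equal to $-\infty$ outside their range. The function names `integral_exp_linear` and `criterion` and the three-point toy sample are hypothetical; this only evaluates the criterion for given candidates and is not the optimization algorithm of the cited references.

```python
import math

def integral_exp_linear(x0, p0, x1, p1):
    """Exact integral over [x0, x1] of exp(phi), where phi is linear
    with phi(x0) = p0 and phi(x1) = p1."""
    dx, dp = x1 - x0, p1 - p0
    if abs(dp) < 1e-12:          # flat segment: exp(phi) is constant
        return dx * math.exp(p0)
    return dx * (math.exp(p1) - math.exp(p0)) / dp

def criterion(xs, phis):
    """Adjusted log-likelihood criterion for a piecewise linear phi with
    knots at the sorted observations xs and values phis at those knots."""
    n = len(xs)
    data_term = sum(phis) / n    # integral of phi against the empirical df
    mass = sum(integral_exp_linear(xs[i], phis[i], xs[i + 1], phis[i + 1])
               for i in range(n - 1))
    return data_term - mass

xs = [0.0, 1.0, 2.0]                                  # toy sorted sample
flat = [math.log(0.5)] * 3                            # uniform density on [0, 2]
tent = [math.log(0.4), math.log(0.6), math.log(0.4)]  # a concave alternative
flat_value = criterion(xs, flat)
tent_value = criterion(xs, tent)
print(flat_value, tent_value)
```

For this toy sample the flat candidate attains the larger criterion value of the two. The second integral acts as a penalty on total mass, which is why the maximizer does not need an explicit normalization constraint.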

1.3. Characterization of $\hat\varphi_n$. For any continuous piecewise linear function $h_n : [X_{(1)}, X_{(n)}] \to \mathbb{R}$, such that the knots of $h_n$ coincide with (some of) the order statistics $X_{(1)}, \ldots, X_{(n)}$, introduce the set of knots $\mathcal{S}_n(h_n)$ of $h_n$ as

$$\mathcal{S}_n(h_n) := \{t \in (X_{(1)}, X_{(n)}) : h_n'(t-) > h_n'(t+)\} \cup \{X_{(1)}, X_{(n)}\}.$$

Dümbgen and Rufibach (2009) found that $\hat\varphi_n$ is piecewise linear, that $\hat\varphi_n = -\infty$ on $\mathbb{R} \setminus [X_{(1)}, X_{(n)}]$ and that the knots of $\hat\varphi_n$ only occur at (some of the) ordered observations $X_{(1)} < \cdots < X_{(n)}$. The latter property is entirely different from the estimation of a $k$-monotone density for $k > 1$ (see below), where the knots fall strictly between observations with probability equal to 1.

According to Theorem 2.4 in Dümbgen and Rufibach (2009), the estimator $\hat\varphi_n$ has the following characterization. For $x \ge X_{(1)}$ (recall that $\hat\varphi_n := -\infty$ outside $[X_{(1)}, X_{(n)}]$), define the processes

$$\hat F_n(x) := \int_{X_{(1)}}^{x} \exp(\hat\varphi_n(t))\, dt, \qquad \hat H_n(x) := \int_{X_{(1)}}^{x} \hat F_n(t)\, dt,$$

$$\mathbb{H}_n(x) := \int_{-\infty}^{x} \mathbb{F}_n(t)\, dt = \int_{X_{(1)}}^{x} \mathbb{F}_n(t)\, dt.$$

Then, the concave function $\hat\varphi_n$ is the MLE of the log-density $\varphi_0$ if, and only if,

$$\hat H_n(x) \begin{cases} \le \mathbb{H}_n(x), & \text{for all } x \ge X_{(1)}, \\ = \mathbb{H}_n(x), & \text{if } x \in \mathcal{S}_n(\hat\varphi_n). \end{cases} \tag{1.1}$$

1.4. Other shape constraints. Maximum likelihood estimation of a monotone density $f_0$ on $[0, \infty)$ was first studied by Grenander (1956). Under the assumption that $f_0$ is continuously differentiable in a neighborhood of a point $x_0 > 0$, such that $f_0'(x_0) < 0$, Prakasa Rao (1969) established the (local) asymptotic distribution theory of the Grenander estimator $\hat f_n$:

$$n^{1/3}(\hat f_n(x_0) - f_0(x_0)) \to_d |f_0(x_0) f_0'(x_0)/2|^{1/3}\, Z,$$

where $Z$ is the slope at zero of the (least) concave majorant of the process $W(t) - t^2$, $t \in \mathbb{R}$, for two-sided Brownian motion $W$ starting at 0.

Under the assumption that the true density $f_0$ is convex on $[0, \infty)$ and that $f_0$ is twice continuously differentiable in a neighborhood of $x_0$ with $f_0''(x_0) > 0$, Groeneboom, Jongbloed and Wellner (2001b) show that the MLE $\hat f_n$ (as well as the least squares estimator of $f_0$) satisfies

$$n^{2/5}(\hat f_n(x_0) - f_0(x_0)) \to_d \bigl(24^{-1} f_0^2(x_0) f_0''(x_0)\bigr)^{1/5}\, \mathbb{H}^{(2)}(0),$$

where $\mathbb{H}$ is a particular upper invelope of an integrated two-sided Brownian motion $+\, t^4$ [see also Groeneboom, Jongbloed and Wellner (2001a)].

The classes of monotone and convex decreasing densities are particular cases of the class of $k$-monotone densities. Modulo a spline interpolation conjecture, Balabdaoui and Wellner (2007) were able to adapt the approach of Groeneboom, Jongbloed and Wellner (2001b) to this general class of densities.

We find that log-concave estimation shares many similarities with the aforementioned shape-constrained estimation problems. In particular, the limiting distribution of the MLE, our nonparametric estimator, involves a stochastic process whose second derivative is concave and which stays below an integrated Brownian motion minus $t^{k+2}$. The even integer $k$ determines the number of vanishing derivatives of the true concave function $\varphi_0$ at the estimation point $x_0$. Using Theorem 2.1, one can derive a procedure for estimation of $k$. This is relevant in practical applications of our results, that is, construction of confidence intervals for the mode using the limiting distribution given in Theorem 2.1. These problems are the subject of ongoing research.
1.5. Organization of the paper. In Section 2, we establish the limiting distributions of the ML estimators, $\hat\varphi_n$ and $\hat f_n$, at a fixed point $x_0 \in \mathbb{R}$ under some specified working assumptions. The characterization of either $\hat\varphi_n$ or $\hat f_n$ given in (1.1) coincides, except for the direction of the inequality, with that of the least-squares estimator of a convex decreasing density, studied by Groeneboom, Jongbloed and Wellner (2001b); see their Lemma 2.2, page 1657. This enables us to adopt the general scheme of the proof in their paper.

Log-concave densities $f$ and their logarithms $\varphi$ can easily have vanishing second and higher derivatives at fixed points; an explicit example will be given in Section 2. Thus, the formulation of our asymptotic results allows higher derivatives of the concave function $\varphi_0$ to vanish at the estimation point. This is somewhat more general than the assumptions of Groeneboom, Jongbloed and Wellner (2001b) (where a natural assumption is that the second derivative is positive at the point of interest, but similar vanishing of second derivatives and existence of a nonzero higher order derivative can also easily occur), but it is analogous to the results of Wright (1981) and Leurgans (1982) for nonparametric estimation of a monotone regression function. Similar results for the Grenander estimator of a monotone density are stated by Anevski and Hössjer (2006). We find that the respective limiting distributions of the MLE and its first derivative depend on a stochastic process, $H_k$, equal almost surely to the "lower invelope" (or just "invelope") on $\mathbb{R}$ of the integrated Brownian motion minus $t^{k+2}$, where $k$ is the order of the first nonzero derivative of $\varphi_0$ at the point of interest.

In Section 3, the estimation point $x_0$ is taken to be equal to the mode, $m_0$, defined to be the smallest point in the modal interval of the log-concave density $f_0$. A natural estimator of $m_0$, which we denote by $\hat M_n$, can be taken to be the smallest number maximizing the MLE $\hat\varphi_n$ or, equivalently, the smallest number maximizing the MLE $\hat f_n$. In this section, we establish our second main result: the asymptotic distribution of $\hat M_n$. Under the assumption that the second derivative $f_0''(m_0) < 0$, we show that this distribution depends on the random variable defined to be the argmax or mode of $H_2^{(2)}$ on $\mathbb{R}$. When the second, third and higher derivatives of order $k - 1$ or lower vanish at $m_0$ but $f_0^{(k)}(m_0) < 0$, then the limit distribution depends on the mode of $H_k^{(2)}$. Proofs are deferred to Section 4.

To illustrate all the quantities for which we provide limiting distributions, in Figure 1 we give plots of $\hat f_n$, $\hat\varphi_n$, $\hat F_n$ and $\hat\lambda_n = \hat f_n/(1 - \hat F_n)$, based on two samples of sizes $n = 20$ and $n = 200$ drawn from a Gamma(2, 1) density $f_0(x) = x e^{-x} 1_{[0,\infty)}(x)$. All these plots were generated using the R-package logcondens [see Rufibach and Dümbgen (2007)].

Fig. 1. Examples for log-concave density, log-density, CDF, and hazard rate estimation for $n = 20, 200$ (true functions shown together with the estimators, the latter as solid lines). The dotted vertical lines indicate the set $\mathcal{S}_n(\hat\varphi_n)$. The dash-dotted vertical lines are placed at the mode of the estimated density.
2. Limiting distribution theory. To state the main result, we make the following assumptions.

2.1. Assumptions. Fix $x_0 \in \mathbb{R}$. We suppose that the true density $f_0 = \exp \varphi_0$ satisfies the following assumptions:

(A1) The density function $f_0 \in \mathcal{LC}$.

(A2) $f_0(x_0) > 0$.

(A3) The function $\varphi_0$ is at least twice continuously differentiable in a neighborhood of $x_0$.

(A4) If $\varphi_0''(x_0) \ne 0$, then $k = 2$. Otherwise, suppose that $k$ is the smallest integer such that $\varphi_0^{(j)}(x_0) = 0$, $j = 2, \ldots, k - 1$, and $\varphi_0^{(k)}(x_0) \ne 0$, and $\varphi_0^{(k)}$ is continuous in a neighborhood of $x_0$.

Note that concavity of $\varphi_0$ and (A3) and (A4) imply that $k$ is necessarily even and that $\varphi_0^{(k)}(x_0) < 0$. Indeed, suppose that $k > 2$. Using Taylor expansion of $\varphi_0''$ up to degree $k - 2$, there exists a small $h > 0$ for which we can write

$$\varphi_0''(x) = \frac{\varphi_0^{(k)}(x_0)}{(k-2)!}(x - x_0)^{k-2} + o((x - x_0)^{k-2}), \qquad x \in [x_0 - h, x_0 + h].$$

Since $\varphi_0''(x) \le 0$ for all $x \in [x_0 - h, x_0 + h]$, it follows that $k - 2$ is even [i.e., $k$ is even and $\varphi_0^{(k)}(x_0) < 0$].

2.2. Notation. Let $W$ denote two-sided Brownian motion, starting at 0. For $t \in \mathbb{R}$, define:

$$Y_k(t) = \begin{cases} \displaystyle\int_0^t W(s)\, ds - t^{k+2}, & \text{if } t \ge 0, \\[2ex] \displaystyle\int_t^0 W(s)\, ds - t^{k+2}, & \text{if } t < 0. \end{cases} \tag{2.1}$$

For the uniform norm of a bounded function $f$, we write $\|f\|_\infty = \sup_{x \in \mathbb{R}} |f(x)|$. The derivative of $\hat\varphi_n$ at $x \in \mathbb{R}$ is as usual denoted by $\hat\varphi_n'(x)$. However, if $x \in \mathcal{S}_n(\hat\varphi_n)$, then we define $\hat\varphi_n'(x)$ as the left-derivative.
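For intuition about the process in (2.1), that is, $Y_k(t) = \int_0^t W(s)\,ds - t^{k+2}$ for $t \ge 0$ and $Y_k(t) = \int_t^0 W(s)\,ds - t^{k+2}$ for $t < 0$, the following sketch simulates one discretized path. The function name `simulate_Y`, the grid size, the horizon and the seed are hypothetical choices; the two-sided Brownian motion is assumed to be built from two independent one-sided motions, and the integral is approximated by the trapezoid rule.

```python
import random

def simulate_Y(k, t_max=2.0, n_steps=400, seed=0):
    """Simulate Y_k on a uniform grid over [-t_max, t_max].

    Returns (ts, ys). k is assumed to be an even positive integer."""
    rng = random.Random(seed)
    dt = t_max / n_steps

    def integrated_bm():
        # Integrated Brownian motion at 0, dt, ..., t_max (trapezoid rule).
        w, integral = 0.0, 0.0
        out = [0.0]
        for _ in range(n_steps):
            w_next = w + rng.gauss(0.0, dt ** 0.5)
            integral += 0.5 * (w + w_next) * dt
            w = w_next
            out.append(integral)
        return out

    right = integrated_bm()   # branch for t >= 0
    left = integrated_bm()    # independent branch for t < 0
    ts, ys = [], []
    for i in range(n_steps, 0, -1):
        t = -i * dt
        ts.append(t)
        ys.append(left[i] - t ** (k + 2))   # t ** (k + 2) >= 0 for even k
    for i in range(n_steps + 1):
        t = i * dt
        ts.append(t)
        ys.append(right[i] - t ** (k + 2))
    return ts, ys

ts, ys = simulate_Y(k=2)
print(ts[0], ys[0], ts[-1], ys[-1])
```

Away from the origin the polynomial drift $-t^{k+2}$ dominates the integrated Brownian motion, so the simulated path is pulled strongly downward at both ends of the grid.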

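Theorem 2.1. Suppose that (A1)-(A4) hold. Then,

$$\begin{pmatrix} n^{k/(2k+1)}(\hat f_n(x_0) - f_0(x_0)) \\ n^{(k-1)/(2k+1)}(\hat f_n'(x_0) - f_0'(x_0)) \end{pmatrix} \to_d \begin{pmatrix} c_k(x_0, \varphi_0) H_k^{(2)}(0) \\ d_k(x_0, \varphi_0) H_k^{(3)}(0) \end{pmatrix}$$

and

$$\begin{pmatrix} n^{k/(2k+1)}(\hat\varphi_n(x_0) - \varphi_0(x_0)) \\ n^{(k-1)/(2k+1)}(\hat\varphi_n'(x_0) - \varphi_0'(x_0)) \end{pmatrix} \to_d \begin{pmatrix} C_k(x_0, \varphi_0) H_k^{(2)}(0) \\ D_k(x_0, \varphi_0) H_k^{(3)}(0) \end{pmatrix},$$

where $H_k$ is the "lower invelope" of the process $Y_k$; that is,

$H_k(t) \le Y_k(t)$ for all $t \in \mathbb{R}$;

$H_k^{(2)}$ is concave;

$H_k(t) = Y_k(t)$, if the slope of $H_k^{(2)}$ decreases strictly at $t$.

The constants $c_k$, $d_k$, $C_k$ and $D_k$ are given by

$$c_k(x_0, \varphi_0) = \left(\frac{f_0(x_0)^{k+1} |\varphi_0^{(k)}(x_0)|}{(k+2)!}\right)^{1/(2k+1)}, \tag{2.2}$$

$$d_k(x_0, \varphi_0) = \left(\frac{f_0(x_0)^{k+2} |\varphi_0^{(k)}(x_0)|^3}{[(k+2)!]^3}\right)^{1/(2k+1)}, \tag{2.3}$$

$$C_k(x_0, \varphi_0) = \left(\frac{|\varphi_0^{(k)}(x_0)|}{f_0(x_0)^{k} (k+2)!}\right)^{1/(2k+1)}, \tag{2.4}$$

$$D_k(x_0, \varphi_0) = \left(\frac{|\varphi_0^{(k)}(x_0)|^3}{f_0(x_0)^{k-1} [(k+2)!]^3}\right)^{1/(2k+1)}. \tag{2.5}$$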
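The four constants can be evaluated numerically. The sketch below assumes the forms $c_k = (f_0^{k+1}|\varphi_0^{(k)}|/(k+2)!)^{1/(2k+1)}$, $d_k = (f_0^{k+2}|\varphi_0^{(k)}|^3/[(k+2)!]^3)^{1/(2k+1)}$, $C_k = (|\varphi_0^{(k)}|/(f_0^{k}(k+2)!))^{1/(2k+1)}$ and $D_k = (|\varphi_0^{(k)}|^3/(f_0^{k-1}[(k+2)!]^3))^{1/(2k+1)}$, all evaluated at $x_0$, as we read (2.2)-(2.5). The function name `limit_constants` and the standard normal example are hypothetical. The check at the end is the delta-method relation $C_k = c_k/f_0(x_0)$ and $D_k = d_k/f_0(x_0)$.

```python
import math

def limit_constants(k, f0_x0, phi0_k_x0):
    """Return (c_k, d_k, C_k, D_k) for given k, f_0(x_0) and the k-th
    derivative of phi_0 = log f_0 at x_0 (assumed nonzero)."""
    a = abs(phi0_k_x0)
    fact = math.factorial(k + 2)
    p = 1.0 / (2 * k + 1)
    c_k = (f0_x0 ** (k + 1) * a / fact) ** p
    d_k = (f0_x0 ** (k + 2) * a ** 3 / fact ** 3) ** p
    C_k = (a / (f0_x0 ** k * fact)) ** p
    D_k = (a ** 3 / (f0_x0 ** (k - 1) * fact ** 3)) ** p
    return c_k, d_k, C_k, D_k

# Standard normal at x_0 = 0: f_0(0) = 1/sqrt(2 pi), phi_0''(0) = -1, k = 2.
f0 = 1.0 / math.sqrt(2.0 * math.pi)
c2, d2, C2, D2 = limit_constants(2, f0, -1.0)
print(c2, d2, C2, D2)
```

Dividing by $f_0(x_0)$ is exactly what turns the density-scale constants into the log-density-scale constants, which is why $f_0(x_0)$ moves from the numerator to the denominator.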

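Corollary 2.2. Suppose that (A1)-(A4) hold with $k = 2$. Then,

$$\begin{pmatrix} n^{2/5}(\hat f_n(x_0) - f_0(x_0)) \\ n^{1/5}(\hat f_n'(x_0) - f_0'(x_0)) \end{pmatrix} \to_d \begin{pmatrix} c_2(x_0, \varphi_0) H_2^{(2)}(0) \\ d_2(x_0, \varphi_0) H_2^{(3)}(0) \end{pmatrix}$$

and

$$\begin{pmatrix} n^{2/5}(\hat\varphi_n(x_0) - \varphi_0(x_0)) \\ n^{1/5}(\hat\varphi_n'(x_0) - \varphi_0'(x_0)) \end{pmatrix} \to_d \begin{pmatrix} C_2(x_0, \varphi_0) H_2^{(2)}(0) \\ D_2(x_0, \varphi_0) H_2^{(3)}(0) \end{pmatrix},$$

where $H_2$ is the (concave) invelope of the process $Y_2$; that is,

$H_2(t) \le Y_2(t)$ for all $t \in \mathbb{R}$;

$H_2^{(2)}$ is concave;

$H_2(t) = Y_2(t)$ if the slope of $H_2^{(2)}$ decreases strictly at $t$.

The constants $c_2$, $d_2$, $C_2$ and $D_2$ are given by (2.2)-(2.5), with $k = 2$.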

Note that the constants $C_2(x_0, \varphi_0)$ and $D_2(x_0, \varphi_0)$, up to inversion of $f_0(x_0)$, exhibit a structure very similar to that of the constants given by Groeneboom, Jongbloed and Wellner (2001b) in the problem of estimating a convex density $g_0$ on $[0, \infty)$. We recall here that, in the latter problem, those constants are found to be equal to (we use our notation to make the comparison easy)

$$c_2(x_0, g_0) = \left(\frac{g_0(x_0)^2 g_0''(x_0)}{24}\right)^{1/5}, \qquad d_2(x_0, g_0) = \left(\frac{g_0(x_0) g_0''(x_0)^3}{24^3}\right)^{1/5}.$$
It is clear that $\varphi_0$ in the log-concave problem plays exactly the same role as $g_0$ in the problem of estimating a convex density. However, in the first case estimation is based on observations which are distributed according to $\exp \varphi_0$, whereas in the latter the data come from $g_0$ itself. A good insight into the difference between the expressions of the asymptotic constants can be gained from the proof of Theorem 4.6 in Section 4. There, we show that the leading coefficient of the drift of the limiting process $Y_k$ depends on $\varphi_0^{(k)}(x_0) f_0(x_0) = f_0^{(k)}(x_0) - (\varphi_0'(x_0))^k f_0(x_0)$, where the second term is "filtered out" in the Taylor expansion of the estimation error in the neighborhood of $x_0$. Hence, $|\varphi_0^{(k)}(x_0)| \cdot f_0(x_0)$ can be viewed as the dominating term replacing $|g_0^{(k)}(x_0)|$ in the convex estimation problem. For $k = 2$, the constants $c_2(x_0, \varphi_0)$ and $d_2(x_0, \varphi_0)$ given in (2.2) and (2.3), with $k = 2$, match closely with $c_2(x_0, g_0)$ and $d_2(x_0, g_0)$ obtained by Groeneboom, Jongbloed and Wellner (2001b) in the convex estimation problem, with $f_0(x_0)$ in the numerator, whereas $f_0(x_0)$ shows up in the denominator in the asymptotic constants $C_2(x_0, \varphi_0)$ and $D_2(x_0, \varphi_0)$. This results from applying the delta-method to $\hat f_n(x_0) = \exp(\hat\varphi_n(x_0))$ and $\hat f_n'(x_0) = \hat\varphi_n'(x_0) \hat f_n(x_0)$, which yields $C_2(x_0, \varphi_0)$ and $D_2(x_0, \varphi_0)$.

Here is an explicit example showing how vanishing second (and higher) derivatives can occur. Consider the density function

$$f_0(x) = \frac{\sqrt{2}\, \Gamma(3/4)}{\pi} \exp(-x^4), \qquad x \in \mathbb{R}.$$

In this case $\varphi_0^{(j)}(x_0) = 0$, $j = 1, 2, 3$, for $x_0 = 0$, and $\varphi_0^{(4)}(x_0) \ne 0$. The following "tilted" version of $f_0$ shows that vanishing second derivatives of $\varphi_0$ can also occur at points other than the mode of $f$:

$$\tilde f_0(x) = \exp(a + bx) f_0(x) = \tilde a \exp(bx - x^4),$$

where $\tilde a = \tilde a(b) := 1/\int_{\mathbb{R}} \exp(bx - x^4)\, dx$; in this case, $\tilde\varphi_0 := \log \tilde f_0$ satisfies $\tilde\varphi_0''(0) = 0$, but the mode $m_0 := M(\tilde f_0) = (b/4)^{1/3} > 0$ when $b > 0$, and $\tilde\varphi_0''(m_0) = -12 (b/4)^{2/3} < 0$.
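The claims in this example are easy to check numerically. The sketch below assumes the normalizing constant $\sqrt{2}\,\Gamma(3/4)/\pi$ for $\exp(-x^4)$, verifies by Simpson's rule that the density integrates to one, and checks the stated mode $(b/4)^{1/3}$ and second derivatives of the tilted log-density $bx - x^4$ (up to an additive constant). The helper `integrate`, the truncation to $[-4, 4]$ and the value $b = 0.5$ are hypothetical choices.

```python
import math

def integrate(fn, lo, hi, n=20000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = fn(lo) + fn(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * fn(lo + i * h)
    return total * h / 3.0

# Normalizing constant of exp(-x^4); the tails beyond |x| = 4 are negligible.
norm_const = math.sqrt(2.0) * math.gamma(0.75) / math.pi
mass = integrate(lambda x: norm_const * math.exp(-x ** 4), -4.0, 4.0)

# Tilted log-density b*x - x^4 (additive constant omitted).
b = 0.5
m0 = (b / 4.0) ** (1.0 / 3.0)
d1_at_mode = b - 4.0 * m0 ** 3      # first derivative at the claimed mode
d2_at_zero = -12.0 * 0.0 ** 2       # second derivative at 0
d2_at_mode = -12.0 * m0 ** 2        # second derivative at the mode
print(mass, m0, d1_at_mode, d2_at_zero, d2_at_mode)
```

The first derivative vanishes at $m_0$, the second derivative vanishes at 0 but not at $m_0$, so the point of vanishing curvature and the mode are indeed distinct once $b > 0$.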

Finally, and in order to compare also the random parts of the limits in the convex and log-concave estimation problems, we would like to note that for our lower invelope process $H_k$, $-H_k$ has the same distribution as the "upper invelope" of $-Y_k$, which was called just the "invelope" in the case $k = 2$ by Groeneboom, Jongbloed and Wellner (2001b): The process $-Y_k$ has a drift equal to plus $t^{k+2}$, which specializes to $t^4$ in the convex density problem with $k = 2$. This "upper invelope" stays above $-Y_k$ and admits a convex second derivative. Since $-W$ has the same distribution as $W$, it follows that the upper and lower invelopes $\mathbb{H}_k$ and $H_k$ (associated with estimation of convex and concave functions, resp.) satisfy $\mathbb{H}_k =_d -H_k$. Since the derivatives at




F. BALABDAOUI, K. RUFIBACH AND J. A. WELLNER 



zero $\mathbb{H}_k^{(2)}(0)$ and $\mathbb{H}_k^{(3)}(0)$ of $\mathbb{H}_k$ are distributed symmetrically about zero, the same is true of the derivatives at zero $H_k^{(2)}(0)$ and $H_k^{(3)}(0)$ of $H_k$.

As shown by Barlow and Proschan (1975), Lemma 5.8, page 77 [see also Marshall and Olkin (1979), page 493; Marshall and Olkin (2007), page 102; An (1998) and Bagnoli and Bergstrom (2005)], if $f_0$ is log-concave, then the hazard function

$$\lambda_0(x) = \frac{f_0(x)}{1 - F_0(x)}, \qquad F_0(x) < 1,$$

is monotone nondecreasing. Defining the estimator of $\lambda_0$ based on the MLE as

$$\hat\lambda_n(x) = \frac{\hat f_n(x)}{1 - \hat F_n(x)},$$

application of the delta-method yields the following corollary.

Corollary 2.3. Suppose that (A1)-(A4) hold. Then,

$$\begin{pmatrix} n^{k/(2k+1)}(\hat\lambda_n(x_0) - \lambda_0(x_0)) \\ n^{(k-1)/(2k+1)}(\hat\lambda_n'(x_0) - \lambda_0'(x_0)) \end{pmatrix} \to_d \begin{pmatrix} g_k(x_0, \varphi_0) H_k^{(2)}(0) \\ h_k(x_0, \varphi_0) H_k^{(3)}(0) \end{pmatrix},$$

where the constants $g_k$ and $h_k$ are given by

$$g_k(x_0, \varphi_0) = c_k(x_0, \varphi_0)/(1 - F_0(x_0)),$$

$$h_k(x_0, \varphi_0) = d_k(x_0, \varphi_0)/(1 - F_0(x_0)).$$
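The monotonicity of the hazard function can be checked directly for the Gamma(2, 1) density $f_0(x) = x e^{-x}$ used in Figure 1, for which $F_0(x) = 1 - (1 + x)e^{-x}$ and hence $\lambda_0(x) = x/(1 + x)$. The following sketch is a grid check only; the function names and the grid on $[0, 10]$ are hypothetical choices.

```python
import math

def gamma21_pdf(x):
    # Gamma(2, 1) density on [0, infinity)
    return x * math.exp(-x)

def gamma21_cdf(x):
    # Closed-form Gamma(2, 1) distribution function
    return 1.0 - (1.0 + x) * math.exp(-x)

def hazard(x):
    # lambda_0(x) = f_0(x) / (1 - F_0(x)), which simplifies to x / (1 + x)
    return gamma21_pdf(x) / (1.0 - gamma21_cdf(x))

grid = [0.05 * i for i in range(201)]          # 0, 0.05, ..., 10
hazards = [hazard(x) for x in grid]
is_nondecreasing = all(b >= a for a, b in zip(hazards, hazards[1:]))
print(is_nondecreasing, hazard(1.0))
```

The plug-in estimator $\hat f_n/(1 - \hat F_n)$ inherits this shape constraint automatically, since $\hat f_n$ is itself log-concave.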

3. Inference about the mode of $f_0$. Estimation of the mode of a unimodal density has been considered by many authors [see, e.g., Parzen (1962), Chernoff (1964), Grenander (1965), Dalenius (1965), Venter (1967), Wegman (1970a, 1970b, 1971), Eddy (1980, 1982), Hall (1982), Müller (1989), Romano (1988), Vieu (1996) and, more recently, Meyer (2001) and Herrmann and Ziegler (2004)].

Empirical studies of the performance of various estimators are given by Dalenius (1965), Ekblom (1972), Meyer (2001) and Meyer and Woodroofe (2004). Many of the methods considered for estimating the mode of a unimodal smooth density use kernel estimation, but others are based on the principle of substitution with another choice of estimator of the population density. For example, the estimators of Venter (1967) are related to nearest-neighbor estimators of the density $f_0$. All the estimators of the mode in the class of unimodal densities known to us involve some more or less ad hoc choice, essentially because the maximum likelihood estimator of a unimodal density is not well defined, as explained by Birgé (1997). [Note that Wegman (1970b, 1971) discussed the nonparametric MLE of a unimodal density subject to a constraint on the height of the mode; without some constraint of this type, the MLE does not exist.]

For virtually all of the estimators of which we are aware, some choice of a smoothing parameter, bandwidth or constraint is required. Empirical choice of smoothing parameters has been studied by Müller (1989), who studied local methods of choosing the smoothing parameter, Grund and Hall (1995), who studied bootstrap methods, and Ziegler (2004), who studied plug-in methods. Klemelä (2005) gave a construction of adaptive estimators based on Lepski's method [Lepskii (1991, 1992)]. For nonparametric Bayes estimators of unimodal densities and, hence, of the mode, see Brunner and Lo (1989) and Ho (2006a, 2006b); for these estimators, choice of a prior is equivalent to a choice of smoothing parameters.

In contrast, estimation in the (large) subclass of log-concave (or strongly unimodal) densities is much simpler, avoiding bandwidth or smoothing parameter choices completely. Since the maximum likelihood estimator exists, we can simply estimate the mode by the mode (or smallest point in a modal interval) of the MLE $\hat f_n$. Using the notation introduced by Eddy (1982) [and also used by Romano (1988)], we let $\hat M_n := M(\hat f_n)$, where $M$ denotes the mode functional (or "smallest argmax" functional) given by

$$M(g) := \inf\Bigl\{t : g(t) = \max_{u \in \mathbb{R}} g(u)\Bigr\}.$$

Because of the adaptive properties of the MLE's $\hat f_n$ of $f_0$ and $\hat\varphi_n$ of $\varphi_0$ discussed in Section 1, we expect $\hat M_n$ to adapt to different local smoothness (or peakedness) hypotheses on $f_0$ [much as the Grenander estimator is locally adaptive in the case of estimating a monotone density, see, for example, Birgé (1989), page 1535]. Here, we study $\hat M_n$ as an estimator of the mode $M(f_0) := m_0$ under just the condition that $f_0$ has a continuous second derivative $f_0''$ in a neighborhood of $m_0$, with $f_0''(m_0) < 0$. We begin in the next subsection with a new asymptotic minimax lower bound for estimation of $m_0$ under this hypothesis. The following subsection gives our new limiting distribution result for the MLE $\hat M_n$ of the mode $m_0$.

3.1. New lower bounds for estimating the mode. Has'minskii (1979) established a lower bound for estimation of the mode $m_0$ of a unimodal density $f \in \mathcal{U}$, assuming that $f$ satisfies $f''(m_0) < 0$. He showed that the best local asymptotic minimax rate of convergence for any estimator of $m_0$ is $n^{-1/5}$. Has'minskii based his proof on a sequence of parametric submodels of the form

$$f_n(x, \theta) = f(x) + \theta n^{-2/5} g(n^{1/5}(x - m_0)),$$

where, for $a := -f''(m_0)$,

$$g(x) := g_a(x) = \begin{cases} x, & \text{if } |x| \le 1/a, \\ 0, & \text{if } |x| \ge K > 1/a, \end{cases}$$

and $g := g_a$ satisfies $g(-x) = -g(x)$ and $|g''(x)| \le a/2$ for all $x \in \mathbb{R}$. However, Has'minskii (1979) did not study the dependence of the local minimax bound on $a = -f''(m_0)$ and $f(m_0)$, leaving his bound in terms of $C_g := f(m_0)/\int g_a^2(x)\, dx$ involving the still unspecified function $g = g_a$.

Here, we consider different parametric submodels and derive the dependence of the constant in the local asymptotic minimax lower bound for estimation of the mode $m_0$ in the family $\mathcal{LC}$ of log-concave (or strongly unimodal) densities.

We want to derive asymptotic lower bounds for the local minimax risks for estimating the mode $M(f)$. The $L_1$-minimax risk for estimating a functional $\nu$ of $f_0$, based on a sample $X_1, \ldots, X_n$ of size $n$ from $f_0$, which is known to be in a subset $\mathcal{LC}_{n,\tau}$ of $\mathcal{LC}$, is defined by

$$\mathrm{MMR}_1(n, T_n, \mathcal{LC}_{n,\tau}) := \inf_{T_n} \sup_{f \in \mathcal{LC}_{n,\tau}} E_f |T_n - \nu(f)|, \tag{3.1}$$

where the infimum ranges over all possible measurable functions $T_n = t_n(X_1, \ldots, X_n)$ mapping $\mathbb{R}^n$ to $\mathbb{R}$. The shrinking classes $\mathcal{LC}_{n,\tau}$ used here are Hellinger balls centered at $f_0$:

$$\mathcal{LC}_{n,\tau} = \Bigl\{f \in \mathcal{LC} : H^2(f, f_0) = \frac{1}{2} \int_{-\infty}^{\infty} \bigl[\sqrt{f(z)} - \sqrt{f_0(z)}\bigr]^2 dz \le \tau/n\Bigr\}.$$

Consider estimation of

$$\nu(f) := M(f) = \inf\Bigl\{t \in \mathbb{R} : f(t) = \sup_{u \in \mathbb{R}} f(u)\Bigr\}. \tag{3.2}$$

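Let $f_0 \in \mathcal{LC}$ and $m_0 = M(f_0)$ be fixed, such that $f_0$ is twice continuously differentiable at $m_0$ and $f_0''(m_0) < 0$. Consider the family $\{\varphi_\varepsilon\}_{\varepsilon > 0}$ and resulting family $\{f_\varepsilon\}_{\varepsilon > 0}$, defined as follows:

$$\varphi_\varepsilon(x) = \begin{cases} \varphi_0(x), & x < m_0 - \varepsilon c_\varepsilon, \\ \varphi_0(x), & x > m_0 + \varepsilon, \\ \varphi_0(m_0 + \varepsilon) + \varphi_0'(m_0 + \varepsilon)(x - m_0 - \varepsilon), & x \in [m_0 - \varepsilon, m_0 + \varepsilon], \\ \varphi_0(m_0 - \varepsilon c_\varepsilon) + \varphi_0'(m_0 - \varepsilon c_\varepsilon)(x - m_0 + \varepsilon c_\varepsilon), & x \in [m_0 - \varepsilon c_\varepsilon, m_0 - \varepsilon), \end{cases}$$

where $c_\varepsilon$ is chosen so that $\varphi_\varepsilon$ is continuous at $m_0 - \varepsilon$. Note that if $\varphi_0(x) = \gamma - \gamma_0 (x - m_0)^2$, then $c_\varepsilon = 3$ for all $\varepsilon$, and, in general, $c_\varepsilon \to 3$ as $\varepsilon \downarrow 0$, since $f_0''(m_0) < 0$. Now define

$$h_\varepsilon(x) := \exp(\varphi_\varepsilon(x)) \quad \text{and} \quad f_\varepsilon(x) := \frac{h_\varepsilon(x)}{\int h_\varepsilon(y)\, dy}.$$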
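The constant $c_\varepsilon$ is defined only implicitly, by requiring that the tangent to $\varphi_0$ at $m_0 - \varepsilon c_\varepsilon$ and the tangent at $m_0 + \varepsilon$ take the same value at $m_0 - \varepsilon$. The following sketch solves this continuity condition by bisection. The function name `solve_c_eps`, the bracketing interval $[1, 10]$ and the two test log-densities are hypothetical choices; the second example uses the Gamma(2, 1) log-density $\log x - x$ (mode at 1), for which $c_\varepsilon$ should only be close to 3 for small $\varepsilon$.

```python
import math

def solve_c_eps(phi, dphi, m0, eps, lo=1.0, hi=10.0, iters=200):
    """Bisection for c in [lo, hi] such that the tangent to phi at
    m0 - c*eps and the tangent at m0 + eps agree at x = m0 - eps."""
    x_join = m0 - eps
    # Value at x_join of the tangent to phi taken at m0 + eps.
    right = phi(m0 + eps) + dphi(m0 + eps) * (x_join - (m0 + eps))

    def gap(c):
        a = m0 - c * eps
        return phi(a) + dphi(a) * (x_join - a) - right

    if not (gap(lo) < 0.0 < gap(hi)):
        raise ValueError("no sign change on [lo, hi]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Quadratic log-density: c_eps equals 3 for every eps.
c_quadratic = solve_c_eps(lambda x: -x * x, lambda x: -2.0 * x, 0.0, 0.1)

# Gamma(2, 1) log-density log(x) - x with mode m0 = 1: c_eps is near 3.
c_gamma = solve_c_eps(lambda x: math.log(x) - x,
                      lambda x: 1.0 / x - 1.0, 1.0, 0.01)
print(c_quadratic, c_gamma)
```

In the quadratic case the condition reduces to $c^2 - 2c - 3 = 0$, whose root larger than 1 is $c = 3$.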
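Then, $f_\varepsilon$ is log-concave for each $\varepsilon > 0$ with mode $m_0 - \varepsilon$ by construction, so with $\nu(f_\varepsilon) := M(f_\varepsilon) :=$ the mode of $f_\varepsilon$, we have

$$\nu(f_\varepsilon) - \nu(f_0) = M(f_\varepsilon) - M(f_0) = m_0 - \varepsilon - m_0 = -\varepsilon.$$

Furthermore, the following lemma holds.

Lemma 3.1. Under the above assumptions,

$$H^2(f_\varepsilon, f_0) = \frac{2 f_0''(m_0)^2}{5 f_0(m_0)}\, \varepsilon^5 + o(\varepsilon^5) := \rho\, \varepsilon^5 + o(\varepsilon^5).$$

Proof. Proceeding as in Jongbloed (1995),

$$H^2(f_\varepsilon, f_0) = \frac{1}{2} \int_{-\infty}^{\infty} \bigl[\sqrt{f_\varepsilon(x)} - \sqrt{f_0(x)}\bigr]^2 dx = \frac{2}{5} \frac{f_0''(m_0)^2}{f_0(m_0)}\, \varepsilon^5 + o(\varepsilon^5)$$

as $\varepsilon \downarrow 0$, where the main contribution comes from the interval $[m_0 - \varepsilon c_\varepsilon, m_0 + \varepsilon]$ on which $\varphi_\varepsilon$ differs from $\varphi_0$. Calculations similar to those of Jongbloed (1995) [see also Jongbloed (2000) and Groeneboom, Jongbloed and Wellner (2001b)] complete the proof of the lemma. □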

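Taking $\varepsilon = c n^{-1/5}$ and defining $f_n := f_{c n^{-1/5}}$ yields

$$\nu(f_n) - \nu(f_0) = M(f_n) - M(f_0) = -c n^{-1/5}$$

and

$$n H^2(f_n, f_0) = \frac{2 f_0''(m_0)^2}{5 f_0(m_0)}\, c^5 + o(1) := \rho c^5 + o(1).$$

Plugging these into the lower bound Lemma 4.1 of Groeneboom (1996), with $\ell(x) := |x|$, yields

$$\liminf_{n \to \infty} \inf_{T_n} n^{1/5} \max\bigl\{E_{n, f_n} |T_n - M(f_n)|,\ E_{n, f_0} |T_n - M(f_0)|\bigr\} \ge \frac{1}{4} c \exp(-2 \rho c^5) = \frac{e^{-1/5}}{4 (10 \rho)^{1/5}} = (0.15512\ldots) \left(\frac{f_0(m_0)}{f_0''(m_0)^2}\right)^{1/5},$$

by choosing $c = (10 \rho)^{-1/5}$. This yields the following proposition.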
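The numerical constant can be reproduced directly. The sketch below maximizes $c \mapsto \tfrac14 c \exp(-2\rho c^5)$ at $c = (10\rho)^{-1/5}$, assuming $\rho = 2 f_0''(m_0)^2/(5 f_0(m_0))$ and normalizing $f_0''(m_0)^2/f_0(m_0) = 1$, so that $\rho = 0.4$. The function name `lower_bound_factor` and the comparison grid are hypothetical.

```python
import math

def lower_bound_factor(rho):
    """Return (c_star, value), where c_star = (10 rho)^(-1/5) maximizes
    c -> 0.25 * c * exp(-2 * rho * c**5) and value is the maximum."""
    c_star = (10.0 * rho) ** (-0.2)
    value = 0.25 * c_star * math.exp(-2.0 * rho * c_star ** 5)
    return c_star, value

# Normalization f_0''(m_0)^2 / f_0(m_0) = 1 gives rho = 2/5.
rho = 0.4
c_star, value = lower_bound_factor(rho)

# Compare against a coarse grid of alternative choices of c.
grid_values = [0.25 * c * math.exp(-2.0 * rho * c ** 5)
               for c in [0.01 * i for i in range(1, 301)]]
print(c_star, value, max(grid_values))
```

Setting the derivative of $c \exp(-2\rho c^5)$ to zero gives $1 - 10\rho c^5 = 0$, which is where the choice $c = (10\rho)^{-1/5}$ and the factor $e^{-1/5}$ come from.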
Proposition 3.2 (Minimax risk lower bound). Suppose that $\nu(f) = M(f)$, as defined in (3.2), and that $\mathcal{LC}_{n,\tau}$ is as defined above, where $f_0''$ is continuous in a neighborhood of $m_0 = M(f_0)$ with $f_0''(m_0) < 0$. Then,

$$\sup_{\tau > 0} \limsup_{n \to \infty} n^{1/5} \inf_{T_n} \sup_{f \in \mathcal{LC}_{n,\tau}} E_f |T_n - M(f)| \ge \frac{1}{4} \left(\frac{5/2}{10 e}\right)^{1/5} \left(\frac{f_0(m_0)}{f_0''(m_0)^2}\right)^{1/5} \approx 0.15512 \left(\frac{f_0(m_0)}{f_0''(m_0)^2}\right)^{1/5}.$$

Remark 3.3. Note that the constant $b(f_0, m_0) := (f_0(m_0)/f_0''(m_0)^2)^{1/5}$ appearing on the right-hand side of this lower bound is scale equivariant in exactly the right way: if $f_c(x) := f_0(m_0 + (x - m_0)/c)/c$ for $c > 0$, then $b(f_c, m_0) = c\, b(f_0, m_0)$ for all $c > 0$. The constant $b(f_0, m_0)$ will appear in the limit distribution in the next subsection.
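The scale equivariance can be verified in a simple case. The sketch below assumes $b(f, m) = (f(m)/f''(m)^2)^{1/5}$ and takes $f_0$ to be the $N(m_0, 1)$ density, so that the rescaled density $f_c$ is the $N(m_0, c^2)$ density with $f_c(m_0) = 1/(c\sqrt{2\pi})$ and $f_c''(m_0) = -f_c(m_0)/c^2$. The function names are hypothetical.

```python
import math

def b_constant(f_at_mode, f2_at_mode):
    """b(f, m) = (f(m) / f''(m)^2)^(1/5)."""
    return (f_at_mode / f2_at_mode ** 2) ** 0.2

def normal_b(sigma):
    """b for the N(m0, sigma^2) density, evaluated at its mode m0."""
    f = 1.0 / (sigma * math.sqrt(2.0 * math.pi))   # density at the mode
    f2 = -f / sigma ** 2                           # second derivative at the mode
    return b_constant(f, f2)

ratio = normal_b(3.0) / normal_b(1.0)
print(normal_b(1.0), normal_b(3.0), ratio)
```

Rescaling by $c$ divides $f(m_0)$ by $c$ and $f''(m_0)$ by $c^3$, so the ratio $f/f''^2$ picks up a factor $c^5$ and its fifth root a factor $c$.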



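Remark 3.4. If $\mathcal{LC}$ is replaced by the class $\mathcal{U}$ of unimodal densities on $\mathbb{R}$ and $\mathcal{LC}_{n,\tau}$ is replaced by $\mathcal{U}_{n,\tau}$ defined analogously, where $f_0$ satisfies $f_0''(m_0) < 0$ and $f_0''$ is continuous in a neighborhood of $m_0$, then a minimax lower bound of the same form as Proposition 3.2 holds with exactly the same dependence on $b(f_0, m_0) = (f_0(m_0)/f_0''(m_0)^2)^{1/5}$, but with the absolute constant $0.15512\ldots$ replaced by $0.19784\ldots$. This can be seen by taking the perturbations $\{f_\varepsilon\}_{\varepsilon > 0}$ defined by

$$f_\varepsilon(x) = \begin{cases} f_0(x), & x < x_0 - \varepsilon, \\ f_0(x), & x > x_0 + \varepsilon, \\ f_0(x_0 - \varepsilon) + b_\varepsilon (x - x_0 + \varepsilon), & x_0 - \varepsilon \le x \le x_0 + \varepsilon, \end{cases}$$

where $b_\varepsilon$ is chosen so that $f_\varepsilon(x_0 + \varepsilon) > f_0(x_0 + \varepsilon)$ and $\int_{x_0 - \varepsilon}^{x_0 + \varepsilon} f_\varepsilon(x)\, dx = \int_{x_0 - \varepsilon}^{x_0 + \varepsilon} f_0(x)\, dx$.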



Remark 3.5. If $\varphi_0$ is continuously $k$-times differentiable in a neighborhood of the mode $m_0$, $\varphi_0^{(j)}(m_0) = 0$ for $j = 2, \ldots, k - 1$ and $\varphi_0^{(k)}(m_0) \ne 0$ [assumption (A4)], then it can be shown that the minimax rate of convergence is $n^{-1/(2k+1)}$ and that the minimax lower bound is proportional to

$$\left(\frac{1}{f_0(m_0)\, \varphi_0^{(k)}(m_0)^2}\right)^{1/(2k+1)} = \left(\frac{f_0(m_0)}{f_0^{(k)}(m_0)^2}\right)^{1/(2k+1)},$$

where the proportionality constant depends on the largest root of the polynomial $x^k - (k/(k-1))\, x^{k-1} - (2k-1)/(k-1)$ (which equals 3 when $k = 2$).
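The root mentioned in this remark is easy to compute. The sketch below assumes the polynomial $p_k(x) = x^k - \frac{k}{k-1} x^{k-1} - \frac{2k-1}{k-1}$ and finds its root on $[1, 4]$ by bisection; since $p_k'(x) = k x^{k-2}(x - 1) \ge 0$ for $x \ge 1$ and $p_k(1) < 0$, that root is the largest real root. The function names and the bracketing interval are hypothetical.

```python
def poly(k, x):
    """p_k(x) = x^k - (k/(k-1)) x^(k-1) - (2k-1)/(k-1), for k >= 2."""
    return x ** k - (k / (k - 1)) * x ** (k - 1) - (2 * k - 1) / (k - 1)

def largest_root(k, lo=1.0, hi=4.0, iters=200):
    """Bisection on [lo, hi]; p_k is increasing there, negative at lo
    and positive at hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if poly(k, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

r2 = largest_root(2)
r4 = largest_root(4)
print(r2, r4)
```

For $k = 2$ the polynomial is $x^2 - 2x - 3 = (x - 3)(x + 1)$, which recovers the value 3.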
3.2. Limiting distribution for the MLE $\hat M_n$ in $\mathcal{LC}$. Now, let $\hat f_n$ be the MLE of $f_0$ in the class $\mathcal{LC}$ of log-concave densities, and let $\hat M_n = M(\hat f_n)$, $m_0 = M(f_0)$. Here is our result concerning the limiting distribution of $\hat M_n$ under the same assumptions on $f_0$ as in the previous section on lower bounds.

Theorem 3.6. Suppose that $f_0''$ is continuous in a neighborhood of $m_0 = M(f_0)$ and that $f_0''(m_0) < 0$. Then,

$$n^{1/5}(\hat M_n - m_0) \to_d \left(\frac{(4!)^2 f_0(m_0)}{f_0''(m_0)^2}\right)^{1/5} M(H_2^{(2)}).$$

Note that the limiting distribution depends on a multiple of the same constant $b(f_0, m_0)$, which appears in the asymptotic minimax lower bound of Proposition 3.2, times a universal term $M(H_2^{(2)})$, the mode of the "estimator" $H_2^{(2)}(t)$ of the canonical concave function $-12 t^2$ in the limit Gaussian problem: estimate the mode of $f_0(t) = -12 t^2$, based on observation of $Y(t) = \int_0^t X(s)\, ds$, when

$$dX(t) = f_0(t)\, dt + dW(t).$$

We expect that this distribution, namely the distribution of

$$M(H_2^{(2)}),$$

will occur in several other problems involving nonparametric estimation of the mode or antimode of convex or concave functions under similar second derivative hypotheses. For example, it seems clear that it will occur as the limiting distribution of the nonparametric estimator of the antimode of a convex bathtub-shaped hazard [in the setting of Jankowski and Wellner (2007)]; as the limiting distribution of the nonparametric estimator of the antimode of a convex regression function in the setting of Groeneboom, Jongbloed and Wellner (2001b); and as the limiting distribution of the nonparametric estimator of the mode of a concave regression function.

When $\varphi_0^{(j)}(m_0) = 0$, for $j = 2, \ldots, k - 1$, $\varphi_0^{(k)}(m_0) \ne 0$, and $\varphi_0^{(k)}$ is continuous in a neighborhood of $m_0$, then an analogous result (with a completely similar proof) holds:

$$n^{1/(2k+1)}(\hat M_n - m_0) \to_d \left(\frac{((k+2)!)^2}{f_0(m_0)\, |\varphi_0^{(k)}(m_0)|^2}\right)^{1/(2k+1)} M(H_k^{(2)}).$$

In particular, when $k = 4$, the rate of convergence is $n^{1/9}$, and the limit distribution becomes that of

$$\left(\frac{(6!)^2 f_0(m_0)}{f_0^{(4)}(m_0)^2}\right)^{1/9} M(H_4^{(2)}).$$

Apparently, estimation of $m_0$ becomes considerably more difficult when the second and possibly higher order derivatives of $\varphi_0$ vanish at $m_0$.

On the other hand, if $\varphi_0$ (or equivalently, $f_0$) is cusp-shaped at $m_0$, then the rate of convergence of $\hat M_n$ is $n^{1/3}$, and the local asymptotic minimax rate of convergence is also $n^{1/3}$; we will pursue these issues elsewhere.

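4. Proofs for Sections 2 and 3. Throughout this section, we fix $k$ and let

$$r_n := n^{(k+2)/(2k+1)}, \qquad s_n := n^{-1/(2k+1)},$$

$$x_n(t) := x_{n,k}(t) := x_0 + s_n t := x_0 + n^{-1/(2k+1)} t,$$

$$I_{n,t} := I(x_0, n, k, t) := \begin{cases} [x_0, x_n(t)], & t \ge 0, \\ [x_n(t), x_0], & t < 0. \end{cases}$$

4.1. Preparation: technical lemmas and tightness results. First, some notation.

Local processes: The local processes $Y_n^{\mathrm{loc}}$ and $\hat H_n^{\mathrm{loc}}$ are defined for $t \in \mathbb{R}$ by

$$Y_n^{\mathrm{loc}}(t) := r_n \int_{x_0}^{x_n(t)} \left\{\mathbb{F}_n(v) - \mathbb{F}_n(x_0) - \int_{x_0}^{v} \sum_{j=0}^{k-1} \frac{f_0^{(j)}(x_0)}{j!} (u - x_0)^j\, du\right\} dv$$

and

$$\hat H_n^{\mathrm{loc}}(t) := r_n \int_{x_0}^{x_n(t)} \int_{x_0}^{v} \left\{\hat f_n(u) - \sum_{j=0}^{k-1} \frac{f_0^{(j)}(x_0)}{j!} (u - x_0)^j\right\} du\, dv + \hat A_n t + \hat B_n,$$

where

$$\hat A_n = r_n s_n (\hat F_n(x_0) - \mathbb{F}_n(x_0)) \tag{4.1}$$

and

$$\hat B_n = r_n (\hat H_n(x_0) - \mathbb{H}_n(x_0)). \tag{4.2}$$

We also define the "modified" local processes

$$Y_n^{\mathrm{locmod}}(t) := \frac{r_n}{f_0(x_0)} \int_{x_0}^{x_n(t)} \left\{\mathbb{F}_n(v) - \mathbb{F}_n(x_0) - \int_{x_0}^{v} \sum_{j=0}^{k-1} \frac{f_0^{(j)}(x_0)}{j!} (u - x_0)^j\, du\right\} dv - r_n \int_{x_0}^{x_n(t)} \int_{x_0}^{v} \hat\Psi_{k,n,2}(u)\, du\, dv \tag{4.3}$$

and

$$\hat H_n^{\mathrm{locmod}}(t) := r_n \int_{x_0}^{x_n(t)} \int_{x_0}^{v} \bigl(\hat\varphi_n(u) - \varphi_0(x_0) - (u - x_0) \varphi_0'(x_0)\bigr)\, du\, dv, \tag{4.4}$$

where $\hat\Psi_{k,n,2}$ is defined below in (4.26).

The following lemma uses the notion of uniform covering numbers; see van der Vaart and Wellner (1996), Sections 2.1 and 2.7, for complete definitions and further information.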

Lemma 4.1. Let $\mathcal{F}$ be a collection of functions defined on $[x_0 - \delta, x_0 + \delta]$, with $\delta > 0$ small, and let $s > 0$. Suppose that for a fixed $x \in [x_0 - \delta, x_0 + \delta]$ and $R > 0$, such that $[x, x + R] \subseteq [x_0 - \delta, x_0 + \delta]$, the collection

$$\mathcal{F}_{x,R} = \{f_{x,y} : f \in \mathcal{F},\ x \le y \le x + R\}$$

admits an envelope $F_{x,R}$, such that

$$E F_{x,R}^2(X_1) \le K R^{2d-1}, \qquad R \le R_0,$$

for some $d \ge 1/2$ and $K > 0$, depending only on $x_0$ and $\delta$. Moreover, suppose that

$$\sup_Q \int_0^1 \sqrt{\log N\bigl(\eta \|F_{x,R}\|_{Q,2}, \mathcal{F}_{x,R}, L_2(Q)\bigr)}\, d\eta < \infty.$$

Then, for each $\varepsilon > 0$, there exist random variables $M_n$ of order $O_p(1)$ (not depending on $x$ or $y$) and $R_0 > 0$, such that

$$\left|\int f_{x,y}\, d(\mathbb{F}_n - F_0)\right| \le \varepsilon |y - x|^{s+d} + n^{-(s+d)/(2s+1)} M_n \qquad \text{for } |y - x| \le R_0.$$

Proof. See Kim and Pollard (1990) and Balabdaoui and Wellner (2007), Lemmas 4.4 and 6.1. The special case $s = 1 = d$ is Lemma 4.1 of Kim and Pollard (1990). □

Lemma 4.2. If (A3) and (A4) hold, then

$$f_0^{(j)}(x_0) = [\varphi_0'(x_0)]^j f_0(x_0) \qquad \text{for } j = 1, \ldots, k-1 \tag{4.6}$$

and, for $j = k$,

$$f_0^{(k)}(x_0) = \big( \varphi_0^{(k)}(x_0) + [\varphi_0'(x_0)]^k \big) f_0(x_0).$$

Proof. The expressions for $f_0^{(j)}(x_0)$ follow immediately from a recursive argument using the identity $f_0 = \exp \varphi_0$ and the assumption $\varphi_0^{(j)}(x_0) = 0$, for $j = 2, \ldots, k-1$, if $k > 2$. □
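The recursion behind Lemma 4.2 can be checked exactly on a concrete example. The following is only an illustrative sketch: the helper `exp_series` and the test log-density $\varphi(x) = x - x^4$ at $x_0 = 0$ (so that $k = 4$, $\varphi'(0) = 1$, $\varphi''(0) = \varphi'''(0) = 0$ and $\varphi^{(4)}(0) = -24$) are assumptions made here, not objects from the paper. The code builds the Taylor coefficients of $\exp \varphi$ from the identity $f' = \varphi' f$ and compares $f^{(j)}(0)/f(0)$ with the two formulas of the lemma.

```python
from fractions import Fraction
from math import factorial


def exp_series(p, order):
    """Taylor coefficients at 0 of exp(p(x)) / exp(p(0)).

    p is the coefficient list [p_0, p_1, ...] of a polynomial.
    Uses f' = p' f, that is, n * f_n = sum_{i=1..n} i * p_i * f_{n-i}.
    """
    f = [Fraction(1)]
    for n in range(1, order + 1):
        top = min(n, len(p) - 1)
        total = sum(Fraction(i) * Fraction(p[i]) * f[n - i]
                    for i in range(1, top + 1))
        f.append(total / n)
    return f


# Hypothetical example: phi(x) = x - x^4, hence k = 4 at x0 = 0.
k = 4
phi = [0, 1, 0, 0, -1]
phi_1 = Fraction(1)                    # phi'(0)
phi_k = Fraction(-1) * factorial(k)    # phi^{(4)}(0) = -24

coeffs = exp_series(phi, k)
# ratios[j] = f^{(j)}(0) / f(0)
ratios = [factorial(j) * coeffs[j] for j in range(k + 1)]

# (4.6): f^{(j)}(0) / f(0) = [phi'(0)]^j for j = 1, ..., k-1
lower_ok = all(ratios[j] == phi_1 ** j for j in range(1, k))
# case j = k: f^{(k)}(0) / f(0) = phi^{(k)}(0) + [phi'(0)]^k
top_ok = ratios[k] == phi_k + phi_1 ** k
print(ratios, lower_ok, top_ok)
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.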

Now, let $\tau_n^+ := \inf\{ t \in \mathcal{S}_n(\hat{\varphi}_n) : t > x_0 \}$ and $\tau_n^- := \sup\{ t \in \mathcal{S}_n(\hat{\varphi}_n) : t < x_0 \}$.

Theorem 4.3. If (A1)-(A4) hold, then

$$\tau_n^+ - \tau_n^- = O_p\big( n^{-1/(2k+1)} \big). \tag{4.7}$$

Theorem 4.3 should be compared to Theorem 3.3 of Dümbgen and Rufibach (2009). When their Theorem 3.3 is specialized to the case $\beta = 2$, so that $\varphi_0''(x) \le C < 0$ for all $x \in T := [A, B]$, it yields the following: If $m_n$ denotes the number of elements in $\mathcal{S}_n(\hat{\varphi}_n) \cap T$, then for any successive knot points $t_{i-1}$ and $t_i$ in $\mathcal{S}_n(\hat{\varphi}_n) \cap T$,

$$\sup_{i = 2, \ldots, m_n} (t_i - t_{i-1}) = O_p\big( \rho_n^{1/5} \big), \tag{4.8}$$

where $\rho_n = \log(n)/n$.

Proof of Theorem 4.3. From the first characterization of the estimator $\hat{f}_n$ in Dümbgen and Rufibach (2009), for every function $\Delta$ such that $\hat{\varphi}_n + t\Delta$ is concave for a $t > 0$ small enough, we know that

$$\int_{\mathbb{R}} \Delta(x) \, d\mathbb{F}_n(x) \le \int_{\mathbb{R}} \Delta(x) \, d\hat{F}_n(x). \tag{4.9}$$

This is equivalent to

$$\int_{\mathbb{R}} \Delta(x) \, d(\mathbb{F}_n(x) - F_0(x)) \le \int_{\mathbb{R}} \Delta(x) \big( \hat{f}_n(x) - f_0(x) \big) dx. \tag{4.10}$$

Using specific indicator functions for $\Delta$, one can furthermore show that

$$\hat{F}_n(\tau) \in [\mathbb{F}_n(\tau) - 1/n, \mathbb{F}_n(\tau)] \tag{4.11}$$

for every $\tau \in \mathcal{S}_n(\hat{\varphi}_n)$ [see Rufibach (2006) and Corollary 2.5 of Dümbgen and Rufibach (2009)].

Now, the idea is to choose a particular permissible perturbation function $\Delta$ that satisfies the following two conditions:

1. $\Delta$ is "local," that is, compactly supported on $[\tau_n^-, \tau_n^+]$.

2. $\Delta$ should "filter" out the unknown error $\hat{f}_n - f_0$.

The second requirement means that $\Delta$ should be chosen so that

$$\int_{\tau_n^-}^{\tau_n^+} \Delta(x) \, dx = 0, \qquad \int_{\tau_n^-}^{\tau_n^+} \Delta(x)(x - \bar{\tau}) \, dx = 0, \tag{4.12}$$

where $\bar{\tau} := (\tau_n^- + \tau_n^+)/2$ is the mid-point of $[\tau_n^-, \tau_n^+]$. If this is guaranteed, then the right-hand side of (4.10) in the end will only depend on the distance $\tau_n^+ - \tau_n^-$ and $f_0(x_0)$.
Define $\Delta_0$ by

$$\Delta_0(x) = (x - \tau_n^-) 1_{[\tau_n^-, \bar{\tau}]}(x) + (\tau_n^+ - x) 1_{[\bar{\tau}, \tau_n^+]}(x).$$

Since $\hat{\varphi}_n + t\Delta_0$ is concave for small $t > 0$, $\Delta_0$ is permissible. It is also compactly supported. However, since $\Delta_0$ is nonnegative, there is no hope that it fulfills the second of the requirements above. We therefore introduce a modified perturbation function

$$\Delta_1(x) = \Delta_0(x) - \tfrac{1}{4} (\tau_n^+ - \tau_n^-) 1_{[\tau_n^-, \tau_n^+]}(x), \qquad x \in \mathbb{R}.$$

Clearly, existence of a $t > 0$, such that $\hat{\varphi}_n + t\Delta_1$ is concave, is no longer guaranteed. However, using (4.11),

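$$\int \Delta_1(x) \, d(\mathbb{F}_n - F_0)(x) = \int \Delta_1(x) \, d(\mathbb{F}_n - \hat{F}_n)(x) + \int \Delta_1(x) \, d(\hat{F}_n - F_0)(x)$$

$$\le -\tfrac{1}{4} (\tau_n^+ - \tau_n^-) \int_{\tau_n^-}^{\tau_n^+} d(\mathbb{F}_n - \hat{F}_n)(x) + \int \Delta_1(x) \, d(\hat{F}_n - F_0)(x) \tag{4.13}$$

$$\le \frac{\tau_n^+ - \tau_n^-}{4n} + \int \Delta_1(x) (\hat{f}_n - f_0)(x) \, dx. \tag{4.14}$$

To get the inequality in (4.13), we used (4.9) with $\Delta = \Delta_0$ and (4.11). The next step is to get bounds for the integrals in the crucial inequality (4.14). Define

$$R_{1n} := \int \Delta_1(x) (\hat{f}_n - f_0)(x) \, dx$$

and

$$R_{2n} := \int \Delta_1(x) \, d(\mathbb{F}_n - F_0)(x).$$

Rearranging the inequality in (4.14) and using these definitions yields

$$-R_{1n} \le \frac{\tau_n^+ - \tau_n^-}{4n} - R_{2n}.$$

Consistency of $\hat{\varphi}_n$, together with $\varphi_0^{(k)}(x_0) < 0$, implies $\tau_n^+ - \tau_n^- = o_p(1)$. Thus, it follows from Lemma 4.4 that

$$M_k f_0(x_0) \big( -\varphi_0^{(k)}(x_0) \big) (\tau_n^+ - \tau_n^-)^{k+2} (1 + o_p(1)) \le o_p(1) n^{-1} + O_p(r_n^{-1}) = O_p(r_n^{-1}).$$

This yields the claimed rate, $O_p(n^{-1/(2k+1)})$, for the distance between $\tau_n^+$ and $\tau_n^-$. □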
Lemma 4.4. Suppose (A1)-(A4) hold. Then,

$$R_{1n} = M_k f_0(x_0) \varphi_0^{(k)}(x_0) (\tau_n^+ - \tau_n^-)^{k+2} (1 + o_p(1))$$

and

$$R_{2n} = O_p(r_n^{-1}),$$

where $M_k > 0$ depends only on $k$, and $\varphi_0^{(k)}(x_0) < 0$.

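Proof. Define the function $\rho_n(t) = \hat{\varphi}_n(t) - \varphi_0(t)$ for any $t \in [\tau_n^-, \tau_n^+]$. Then, using Taylor expansion of $h \mapsto \exp(h)$ up to order $k$, we can find $\theta_{t,n}$, lying between $0$ and $\rho_n(t)$ for each $t \in [\tau_n^-, \tau_n^+]$, such that

$$R_{1n} = \int_{\tau_n^-}^{\tau_n^+} \Delta_1(t) f_0(t) \big( \exp(\rho_n(t)) - 1 \big) dt = \sum_{j=1}^{k-1} \frac{1}{j!} S_{nj} + \frac{1}{k!} S_{nk},$$

where

$$S_{nj} := \int_{\tau_n^-}^{\tau_n^+} \Delta_1(t) f_0(t) \rho_n(t)^j \, dt \qquad \text{for } 1 \le j \le k-1$$

and

$$S_{nk} := \int_{\tau_n^-}^{\tau_n^+} \Delta_1(t) f_0(t) \exp(\theta_{t,n}) \rho_n(t)^k \, dt.$$

If we expand $f_0(t)$ around the mid-point $\bar{\tau}$ of $[\tau_n^-, \tau_n^+]$, we get, for $1 \le j \le k-1$ and an $\eta_{n,t,j} \in [\tau_n^-, \tau_n^+]$,

$$S_{nj} = \sum_{l=0}^{k-1} \frac{f_0^{(l)}(\bar{\tau})}{l!} \int_{\tau_n^-}^{\tau_n^+} \Delta_1(t) (t - \bar{\tau})^l \rho_n(t)^j \, dt + \int_{\tau_n^-}^{\tau_n^+} \frac{f_0^{(k)}(\eta_{n,t,j})}{k!} \Delta_1(t) (t - \bar{\tau})^k \rho_n(t)^j \, dt$$

and, for $j = k$,

$$S_{nk} = \sum_{l=0}^{k-1} \frac{f_0^{(l)}(\bar{\tau})}{l!} \int_{\tau_n^-}^{\tau_n^+} \Delta_1(t) \exp(\theta_{t,n}) (t - \bar{\tau})^l \rho_n(t)^k \, dt + \int_{\tau_n^-}^{\tau_n^+} \frac{f_0^{(k)}(\eta_{n,t,k})}{k!} \Delta_1(t) \exp(\theta_{t,n}) (t - \bar{\tau})^k \rho_n(t)^k \, dt.$$

It turns out that the dominating term in $R_{1n}$ is the first term in the Taylor expansion of $S_{n1}$. All the other terms are of smaller order since both $\rho_n(t)$ and $t - \bar{\tau}$ are $o_p(1)$ uniformly in $t \in [\tau_n^-, \tau_n^+]$. We denote this dominating term by $Q_{1n}$. Since $\hat{\varphi}_n$ is linear on $[\tau_n^-, \tau_n^+]$, we write $\hat{\varphi}_n(t) = \hat{\varphi}_n(\bar{\tau}) + (t - \bar{\tau}) \hat{\varphi}_n'(\bar{\tau})$. By Taylor expansion of $\rho_n$ around $\bar{\tau}$, we get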



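$$\frac{Q_{1n}}{f_0(\bar{\tau})} = \int_{\tau_n^-}^{\tau_n^+} \Delta_1(t) \rho_n(t) \, dt = \rho_n(\bar{\tau}) \int_{\tau_n^-}^{\tau_n^+} \Delta_1(t) \, dt + \rho_n'(\bar{\tau}) \int_{\tau_n^-}^{\tau_n^+} \Delta_1(t) (t - \bar{\tau}) \, dt - \sum_{j=2}^{k} \frac{\varphi_0^{(j)}(\bar{\tau})}{j!} \int_{\tau_n^-}^{\tau_n^+} \Delta_1(t) (t - \bar{\tau})^j \, dt - \int_{\tau_n^-}^{\tau_n^+} \varepsilon_n(t) \Delta_1(t) (t - \bar{\tau})^k \, dt,$$

where the first two terms are zero, since (4.12) holds when $\Delta = \Delta_1$, and $\| \varepsilon_n \|_\infty \to_p 0$ as $\tau_n^+ - \tau_n^- \to_p 0$. Using the fact that

$$\int_{\tau_n^-}^{\tau_n^+} \Delta_1(t) (t - \bar{\tau})^j \, dt = \begin{cases} 0, & \text{for } j = 0 \text{ and } j \text{ odd}, \\[4pt] -\dfrac{j (\tau_n^+ - \tau_n^-)^{j+2}}{2^{j+2} (j+1)(j+2)}, & \text{for } j \text{ even}, \end{cases} \tag{4.15}$$

we conclude that

$$Q_{1n} = \frac{k}{2^{k+2} (k+2)!} f_0(\bar{\tau}) \varphi_0^{(k)}(\bar{\tau}) (\tau_n^+ - \tau_n^-)^{k+2} (1 + o_p(1)),$$

and the claimed form of $R_{1n}$ in the lemma follows.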
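The moment identity (4.15) can be verified in exact arithmetic. The sketch below is illustrative only; the function names `delta1_moment` and `closed_form` are hypothetical. It uses the fact that, in the centered variable $u = t - \bar{\tau}$ with $h = (\tau_n^+ - \tau_n^-)/2$, the perturbation is $\Delta_1 = h/2 - |u|$ on $[-h, h]$, integrates $u^j \Delta_1(u)$ piecewise, and compares the result with the right-hand side of (4.15).

```python
from fractions import Fraction


def delta1_moment(j, tau_minus, tau_plus):
    """Exact integral of Delta_1(t) * (t - midpoint)^j over [tau_minus, tau_plus]."""
    length = Fraction(tau_plus) - Fraction(tau_minus)
    h = length / 2
    # On [0, h]: integral of (h/2 - u) * u^j du.
    right = (h / 2) * h ** (j + 1) / (j + 1) - h ** (j + 2) / (j + 2)
    # On [-h, 0]: the substitution u -> -u gives (-1)^j times the same value.
    left = (-1) ** j * right
    return right + left


def closed_form(j, tau_minus, tau_plus):
    """Right-hand side of (4.15)."""
    length = Fraction(tau_plus) - Fraction(tau_minus)
    if j == 0 or j % 2 == 1:
        return Fraction(0)
    return -j * length ** (j + 2) / (2 ** (j + 2) * (j + 1) * (j + 2))


# Arbitrary illustrative knots; any pair tau_minus < tau_plus works.
lo, hi = Fraction(-1, 3), Fraction(5, 3)
checks = [delta1_moment(j, lo, hi) == closed_form(j, lo, hi) for j in range(9)]
print(checks)
```

The cases $j = 0$ and $j = 1$ reproduce the two filtering conditions in (4.12).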

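For $R_{2n}$ we proceed along the lines of the proof of Lemma 4.1 in Groeneboom, Jongbloed and Wellner (2001b). This means we have to line up with the assumption of Theorem 2.14.1 in van der Vaart and Wellner (1996). Therefore, define a generalized version of $R_{2n}$, with $\Delta_1$ built from the interval $[x, y]$ in place of $[\tau_n^-, \tau_n^+]$:

$$R_{2n}^{x,y} := \int_x^y \Delta_1(z) \, d(\mathbb{F}_n - F_0)(z)$$

for $-\infty < x \le y$. With this function, we have, for some $R > 0$,

$$\sup_{y : 0 \le y - x \le R} | R_{2n}^{x,y} | \le 2 \sup_{y : 0 \le y - x \le R} \Big| \int_x^{(x+y)/2} \big( z - x - \tfrac{1}{4} (y - x) \big) d(\mathbb{F}_n - F_0)(z) \Big| = 2 \sup_{y : 0 \le y - x \le R} \Big| \int h_{x,y}(z) \, d(\mathbb{F}_n - F_0)(z) \Big|,$$

where

$$h_{x,y}(z) = \big( z - x - \tfrac{1}{4} (y - x) \big) 1_{[x, (x+y)/2]}(z) = h(z) 1_{[x, (x+y)/2]}(z).$$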
Then, the collection of functions

$$\mathcal{F}_{x,R} = \{ h 1_{[x, (x+y)/2]} : x \le y \le x + R \}$$

is a Vapnik-Chervonenkis subgraph class with envelope function

$$F_{x,R}(z) = \big( z - x + \tfrac{R}{4} \big) 1_{[x, x+R]}(z).$$

Finally, Theorem 2.6.7 in van der Vaart and Wellner (1996) yields the entropy condition (4.5).

A log-concave density is always unimodal and the value at the mode is finite, and hence, $K := \| f_0 \|_\infty$ is finite. Therefore,

$$E F_{x,R}^2(X_1) = \int_x^{x+R} (z - x)^2 f_0(z) \, dz + \frac{R}{2} \int_x^{x+R} (z - x) f_0(z) \, dz + \frac{R^2}{16} \int_x^{x+R} f_0(z) \, dz \le K \Big( \frac{R^3}{3} + \frac{R^3}{4} + \frac{R^3}{16} \Big) = \frac{31}{48} K R^3.$$

It follows from Lemma 4.1, with $d = 2$ and $s = k$, that $R_{2n} = O_p(r_n^{-1})$. □
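The constant $31/48$ in the envelope bound can be confirmed with a short exact computation. This is a sketch under the assumption that the envelope is $F_{x,R}(z) = (z - x + R/4) 1_{[x, x+R]}(z)$ and that $f_0$ is replaced by its upper bound $K$; the function names are hypothetical. With $w = z - x$, it compares the antiderivative of $(w + R/4)^2$ on $[0, R]$ with the three-term expansion used in the proof.

```python
from fractions import Fraction


def envelope_sq_integral(R):
    """Exact integral over [0, R] of (w + R/4)^2 dw, where w = z - x."""
    R = Fraction(R)
    upper = (R + R / 4) ** 3 / 3
    lower = (R / 4) ** 3 / 3
    return upper - lower


def three_term_sum(R):
    """Integrals of w^2, (R/2) * w and R^2/16 over [0, R], added up."""
    R = Fraction(R)
    return R ** 3 / 3 + (R / 2) * (R ** 2 / 2) + (R ** 2 / 16) * R


R = Fraction(3, 7)  # arbitrary illustrative radius
print(envelope_sq_integral(R), three_term_sum(R), Fraction(31, 48) * R ** 3)
```

Multiplying by $K$ gives $E F_{x,R}^2(X_1) \le (31/48) K R^3$, which is of the form $K' R^{2d-1}$ with $d = 2$, as required by Lemma 4.1.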
4.2. Proofs for Section 2. 

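Lemma 4.5. For any $M > 0$, we have

$$\sup_{|t| \le M} | \hat{\varphi}_n'(x_0 + s_n t) - \varphi_0'(x_0) | = O_p(s_n^{k-1}), \tag{4.16}$$

$$\sup_{|t| \le M} | \hat{\varphi}_n(x_0 + s_n t) - \varphi_0(x_0) - s_n t \varphi_0'(x_0) | = O_p(s_n^k). \tag{4.17}$$

Furthermore, if we define, for any $u \in \mathbb{R}$,

$$e_n(u) = \hat{f}_n(u) - \sum_{j=0}^{k-1} \frac{f_0^{(j)}(x_0)}{j!} (u - x_0)^j - f_0(x_0) \frac{[\varphi_0'(x_0)]^k}{k!} (u - x_0)^k,$$

then

$$\sup_{|t| \le M} \big| e_n(x_0 + s_n t) - f_0(x_0) \big( \hat{\varphi}_n(x_0 + s_n t) - \varphi_0(x_0) - s_n t \varphi_0'(x_0) \big) \big| = o_p(s_n^k). \tag{4.18}$$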

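Proof. The proof of (4.16) and (4.17) is identical to that of Lemma 4.4 in Groeneboom, Jongbloed and Wellner (2001b) since the characterization of $\hat{f}_n$ given in (1.1) is (up to the direction of the inequality) equivalent to that of the least-squares estimator of a convex density.

Now, we prove (4.18). Using Taylor expansion of $h \mapsto \exp(h)$ up to order $k$ around zero, we can write

$$\hat{f}_n(u) - f_0(x_0) = f_0(x_0) \big[ \exp(\hat{\varphi}_n(u) - \varphi_0(x_0)) - 1 \big] = f_0(x_0) \sum_{j=1}^{k} \frac{1}{j!} (\hat{\varphi}_n(u) - \varphi_0(x_0))^j + f_0(x_0) \Psi_{k,n,1}(u), \tag{4.19}$$

where

$$\Psi_{k,n,1}(u) = \sum_{j=k+1}^{\infty} \frac{1}{j!} (\hat{\varphi}_n(u) - \varphi_0(x_0))^j.$$

But, for any $j \ge 1$,

$$(\hat{\varphi}_n(u) - \varphi_0(x_0))^j = \big[ \hat{\varphi}_n(u) - \varphi_0(x_0) - (u - x_0) \varphi_0'(x_0) + (u - x_0) \varphi_0'(x_0) \big]^j$$

$$= \sum_{r=1}^{j} \binom{j}{r} \big[ \hat{\varphi}_n(u) - \varphi_0(x_0) - (u - x_0) \varphi_0'(x_0) \big]^r [\varphi_0'(x_0)]^{j-r} (u - x_0)^{j-r} + [\varphi_0'(x_0)]^j (u - x_0)^j. \tag{4.20}$$

Hence, using (4.17) and (A3), we get on the set $\{ u : |u - x_0| \le M n^{-1/(2k+1)} \}$

$$(\hat{\varphi}_n(u) - \varphi_0(x_0))^j = O_p\big( n^{-j/(2k+1)} \big)$$

for all $j \ge k + 1$.

In particular, this implies that

$$\Psi_{k,n,1}(u) = o_p\big( n^{-k/(2k+1)} \big), \tag{4.21}$$

uniformly in $u \in [x_0 - t n^{-1/(2k+1)}, x_0 + t n^{-1/(2k+1)}]$, where $|t| \le M$, and

$$\hat{f}_n(u) - f_0(x_0) - f_0(x_0) \big( \hat{\varphi}_n(u) - \varphi_0(x_0) - (u - x_0) \varphi_0'(x_0) \big) - f_0(x_0) \sum_{j=1}^{k} \frac{1}{j!} [\varphi_0'(x_0)]^j (u - x_0)^j = o_p\big( n^{-k/(2k+1)} \big).$$

Using Lemma 4.2, the latter can be rewritten as

$$\hat{f}_n(u) - f_0(x_0) - f_0(x_0) \big( \hat{\varphi}_n(u) - \varphi_0(x_0) - (u - x_0) \varphi_0'(x_0) \big) - \sum_{j=1}^{k-1} \frac{f_0^{(j)}(x_0)}{j!} (u - x_0)^j - f_0(x_0) \frac{[\varphi_0'(x_0)]^k}{k!} (u - x_0)^k = o_p\big( n^{-k/(2k+1)} \big)$$

or, equivalently,

$$e_n(x_0 + s_n t) - f_0(x_0) \big( \hat{\varphi}_n(x_0 + s_n t) - \varphi_0(x_0) - s_n t \varphi_0'(x_0) \big) = o_p(s_n^k),$$

uniformly in $|t| \le M$. □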

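Theorem 4.6. Let $K > 0$.

(i) If $\{ Y_k(t), t \in \mathbb{R} \}$ is the canonical process defined in (2.1), then the localized process $\gamma_1 \mathbb{Y}_n^{\mathrm{locmod}}(\gamma_2 \cdot)$ converges weakly in $C[-K, K]$ to $Y_k$, where

$$\gamma_1 = \Big( \frac{f_0(x_0)^{k+2} |\varphi_0^{(k)}(x_0)|^3}{[(k+2)!]^3} \Big)^{1/(2k+1)}, \tag{4.22}$$

$$\gamma_2 = \Big( \frac{f_0(x_0) |\varphi_0^{(k)}(x_0)|^2}{[(k+2)!]^2} \Big)^{-1/(2k+1)}. \tag{4.23}$$

Equivalently, $\mathbb{Y}_n^{\mathrm{locmod}}$ converges weakly in $C[-K, K]$ to the "driving process" $Y_{k,a,\sigma}$, where

$$Y_{k,a,\sigma}(t) := a \int_0^t W(s) \, ds - \sigma t^{k+2} \tag{4.24}$$

and where $a = 1/\sqrt{f_0(x_0)}$, $\sigma = |\varphi_0^{(k)}(x_0)| / (k+2)!$.

(ii) The localized processes satisfy $\mathbb{Y}_n^{\mathrm{locmod}}(t) - \hat{H}_n^{\mathrm{locmod}}(t) \ge 0$, for all $t \in \mathbb{R}$, with equality for all $t$ such that $x_n(t) = x_0 + t n^{-1/(2k+1)} \in \mathcal{S}_n(\hat{\varphi}_n)$.

(iii) Both $\hat{A}_n$ and $\hat{B}_n$ defined above in (4.1) and (4.2) are tight.

(iv) The vector of processes

$$\big( \hat{H}_n^{\mathrm{locmod}}, (\hat{H}_n^{\mathrm{locmod}})^{(1)}, (\hat{H}_n^{\mathrm{locmod}})^{(2)}, \mathbb{Y}_n^{\mathrm{locmod}}, (\hat{H}_n^{\mathrm{locmod}})^{(3)}, (\mathbb{Y}_n^{\mathrm{locmod}})^{(1)} \big)$$

converges weakly in $(C[-K, K])^4 \times (D[-K, K])^2$, endowed with the product topology induced by the uniform topology on the spaces $C[-K, K]$ and the Skorohod topology on the spaces $D[-K, K]$, to the process

$$\big( H_{k,a,\sigma}, H_{k,a,\sigma}^{(1)}, H_{k,a,\sigma}^{(2)}, Y_{k,a,\sigma}, H_{k,a,\sigma}^{(3)}, Y_{k,a,\sigma}^{(1)} \big),$$

where $H_{k,a,\sigma}$ is the unique process on $\mathbb{R}$ satisfying

$$\begin{cases} H_{k,a,\sigma}(t) \le Y_{k,a,\sigma}(t), & \text{for all } t \in \mathbb{R}, \\ \int \big( H_{k,a,\sigma}(t) - Y_{k,a,\sigma}(t) \big) dH_{k,a,\sigma}^{(3)}(t) = 0, \\ H_{k,a,\sigma}^{(2)} \text{ is concave.} \end{cases} \tag{4.25}$$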
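Proof. (i) The first step will be to modify the local processes, that is, going from the "density" to the "log-density" level, in order to be able to exploit concavity of $\varphi_0$ and $\hat{\varphi}_n$ and connect the local process to the limiting distribution obtained by Groeneboom, Jongbloed and Wellner (2001b) for estimating a convex density.

First, by Lemma 4.2, (4.19) and (A3), we can write

$$\frac{1}{f_0(x_0)} \Big[ \hat{f}_n(u) - \sum_{j=0}^{k-1} \frac{f_0^{(j)}(x_0)}{j!} (u - x_0)^j \Big]$$

$$= \Psi_{k,n,1}(u) + \sum_{j=1}^{k} \frac{1}{j!} [\hat{\varphi}_n(u) - \varphi_0(x_0)]^j - \sum_{j=1}^{k-1} \frac{[\varphi_0'(x_0)]^j}{j!} (u - x_0)^j$$

$$= \Psi_{k,n,1}(u) + \big( \hat{\varphi}_n(u) - \varphi_0(x_0) - \varphi_0'(x_0)(u - x_0) \big) + \sum_{j=2}^{k} \frac{1}{j!} [\hat{\varphi}_n(u) - \varphi_0(x_0)]^j - \sum_{j=2}^{k-1} \frac{[\varphi_0'(x_0)]^j}{j!} (u - x_0)^j$$

$$=: \big( \hat{\varphi}_n(u) - \varphi_0(x_0) - \varphi_0'(x_0)(u - x_0) \big) + \Psi_{k,n,2}(u),$$

introducing the new remainder term

$$\Psi_{k,n,2}(u) = \Psi_{k,n,1}(u) + \sum_{j=2}^{k} \frac{1}{j!} [\hat{\varphi}_n(u) - \varphi_0(x_0)]^j - \sum_{j=2}^{k-1} \frac{[\varphi_0'(x_0)]^j}{j!} (u - x_0)^j. \tag{4.26}$$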

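Using (4.20) and (4.21) yields

$$\int_{x_0}^{x_n(t)} \int_{x_0}^{v} \Psi_{k,n,2}(u) \, du \, dv$$

$$= \int_{x_0}^{x_n(t)} \int_{x_0}^{v} \Psi_{k,n,1}(u) \, du \, dv + \sum_{j=2}^{k} \frac{1}{j!} \int_{x_0}^{x_n(t)} \int_{x_0}^{v} [\hat{\varphi}_n(u) - \varphi_0(x_0)]^j \, du \, dv - \sum_{j=2}^{k-1} \frac{1}{j!} \int_{x_0}^{x_n(t)} \int_{x_0}^{v} [\varphi_0'(x_0)]^j (u - x_0)^j \, du \, dv$$

$$= o_p(r_n^{-1}) + \sum_{j=2}^{k} \frac{1}{j!} \sum_{l=1}^{j} \binom{j}{l} \int_{x_0}^{x_n(t)} \int_{x_0}^{v} \big[ \hat{\varphi}_n(u) - \varphi_0(x_0) - (u - x_0) \varphi_0'(x_0) \big]^l (u - x_0)^{j-l} [\varphi_0'(x_0)]^{j-l} \, du \, dv$$

$$\quad + \sum_{j=2}^{k} \frac{1}{j!} \int_{x_0}^{x_n(t)} \int_{x_0}^{v} [\varphi_0'(x_0)]^j (u - x_0)^j \, du \, dv - \sum_{j=2}^{k-1} \frac{1}{j!} \int_{x_0}^{x_n(t)} \int_{x_0}^{v} [\varphi_0'(x_0)]^j (u - x_0)^j \, du \, dv$$

$$= o_p(r_n^{-1}) + \sum_{j=2}^{k} \frac{1}{j!} \sum_{l=1}^{j} \binom{j}{l} \int_{x_0}^{x_n(t)} \int_{x_0}^{v} \big[ \hat{\varphi}_n(u) - \varphi_0(x_0) - (u - x_0) \varphi_0'(x_0) \big]^l (u - x_0)^{j-l} [\varphi_0'(x_0)]^{j-l} \, du \, dv$$

$$\quad + \frac{1}{k!} \int_{x_0}^{x_n(t)} \int_{x_0}^{v} (u - x_0)^k [\varphi_0'(x_0)]^k \, du \, dv,$$

where the $o_p(r_n^{-1})$ term is uniform over $u \in [x_0, v]$, $v \in [x_0, x_n(t)]$. But by Lemma 4.5, one can easily show that, for $j = 2, \ldots, k$ and $l = 1, \ldots, j$,

$$r_n \int_{x_0}^{x_n(t)} \int_{x_0}^{v} \big[ \hat{\varphi}_n(u) - \varphi_0(x_0) - (u - x_0) \varphi_0'(x_0) \big]^l (u - x_0)^{j-l} [\varphi_0'(x_0)]^{j-l} \, du \, dv = O_p\big( n^{-[k(l-1) + (j-l)]/(2k+1)} \big) = o_p(1),$$

uniformly in $|t| \le M$. Similarly,

$$r_n \frac{1}{k!} \int_{x_0}^{x_n(t)} \int_{x_0}^{v} (u - x_0)^k [\varphi_0'(x_0)]^k \, du \, dv = \frac{[\varphi_0'(x_0)]^k}{(k+2)!} t^{k+2}.$$

Hence, it follows that

$$r_n \int_{x_0}^{x_n(t)} \int_{x_0}^{v} \Psi_{k,n,2}(u) \, du \, dv = \frac{[\varphi_0'(x_0)]^k}{(k+2)!} t^{k+2} + o_p(1)$$

as $n \to \infty$, uniformly in $|t| \le M$.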
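The exponent bookkeeping for the cross terms can be tabulated mechanically. The sketch below is illustrative, with hypothetical function names, and assumes the normalizations $r_n = n^{(k+2)/(2k+1)}$ and $s_n = n^{-1/(2k+1)}$. Each cross term is of order $r_n \cdot s_n^2 \cdot (s_n^k)^l \cdot s_n^{j-l}$: the factor $s_n^2$ is the area of the double integral, $(s_n^k)^l$ comes from (4.17), and $s_n^{j-l}$ from $|u - x_0| \le M s_n$.

```python
from fractions import Fraction


def cross_term_exponent(k, j, l):
    """Exponent e such that the (j, l) cross term is O_p(n^e)."""
    r_exp = Fraction(k + 2, 2 * k + 1)   # r_n = n^{(k+2)/(2k+1)}
    s_exp = Fraction(-1, 2 * k + 1)      # s_n = n^{-1/(2k+1)}
    # r_n * s_n^2 * (s_n^k)^l * s_n^(j - l)
    return r_exp + s_exp * (2 + k * l + (j - l))


def all_cross_terms_vanish(k):
    """True if every cross term with 2 <= j <= k, 1 <= l <= j is o_p(1)."""
    return all(cross_term_exponent(k, j, l) < 0
               for j in range(2, k + 1)
               for l in range(1, j + 1))


def leading_term_exponent(k):
    """Exponent of r_n * s_n^(k+2), the order of the surviving t^(k+2) term."""
    return Fraction(k + 2, 2 * k + 1) + Fraction(-1, 2 * k + 1) * (k + 2)


print(cross_term_exponent(2, 2, 1), all_cross_terms_vanish(4), leading_term_exponent(4))
```

The exponent returned by `cross_term_exponent` equals $-[k(l-1) + (j-l)]/(2k+1)$, which is strictly negative whenever $j \ge 2$, while the leading term has exponent zero and therefore survives in the limit.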

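We turn now to the modified local processes, $\mathbb{Y}_n^{\mathrm{locmod}}$ and $\hat{H}_n^{\mathrm{locmod}}$, defined in (4.3) and (4.4). It is not difficult to show that

$$\mathbb{Y}_n^{\mathrm{locmod}}(t) = \frac{\mathbb{Y}_n^{\mathrm{loc}}(t)}{f_0(x_0)} - r_n \int_{x_0}^{x_n(t)} \int_{x_0}^{v} \Psi_{k,n,2}(u) \, du \, dv \tag{4.27}$$

and

$$\hat{H}_n^{\mathrm{locmod}}(t) = \frac{\hat{H}_n^{\mathrm{loc}}(t)}{f_0(x_0)} - r_n \int_{x_0}^{x_n(t)} \int_{x_0}^{v} \Psi_{k,n,2}(u) \, du \, dv. \tag{4.28}$$

Note that the process $\hat{H}_n^{\mathrm{locmod}}$ is in fact similar to $\hat{H}_n^{\mathrm{loc}}$, except that it is defined in terms of the log-density $\varphi_0$ instead of the density $f_0$. This can be more easily seen from its original expression given in (4.4). The second expression of $\hat{H}_n^{\mathrm{locmod}}$ given above is only useful for showing that it stays below $\mathbb{Y}_n^{\mathrm{locmod}}$, while touching it at points $t$ such that $x_n(t) = x_0 + t n^{-1/(2k+1)} \in \mathcal{S}_n(\hat{\varphi}_n)$. The biggest advantage of considering this modified version is to be able to use concavity of $\varphi_0$ the same way Groeneboom, Jongbloed and Wellner (2001b) used convexity of the true estimated density $g_0$. Their process $\hat{H}_n^{\mathrm{loc}}$ resembles $\hat{H}_n^{\mathrm{locmod}}$ to a large extent (see page 1688), and by combining arguments similar to theirs with Lemma 4.2 and the results obtained above, it follows that

$$\mathbb{Y}_n^{\mathrm{locmod}}(t) \Rightarrow [f_0(x_0)]^{-1/2} \int_0^t W(s) \, ds + \Big( \frac{f_0^{(k)}(x_0)}{f_0(x_0) (k+2)!} - \frac{[\varphi_0'(x_0)]^k}{(k+2)!} \Big) t^{k+2}$$

$$= [f_0(x_0)]^{-1/2} \int_0^t W(s) \, ds + \frac{\varphi_0^{(k)}(x_0)}{(k+2)!} t^{k+2}$$

$$= Y_{k,a,\sigma}(t) \qquad \text{in } C[-K, K],$$

where $a := [f_0(x_0)]^{-1/2}$, $\sigma := |\varphi_0^{(k)}(x_0)| / (k+2)!$, as in (4.24).

Now, let $\gamma_1$ and $\gamma_2$ be chosen so that

$$\gamma_1 Y_{k,a,\sigma}(\gamma_2 t) \stackrel{d}{=} Y_k(t)$$

as processes, where $Y_k$ is the integrated Gaussian process defined in (2.1). Using the scaling property of Brownian motion [i.e., $\alpha^{-1/2} W(\alpha t) \stackrel{d}{=} W(t)$ for any $\alpha > 0$], we get

$$\gamma_1 \gamma_2^{3/2} = a^{-1} \quad \text{and} \quad \gamma_1 \gamma_2^{k+2} = \sigma^{-1}.$$

This yields $\gamma_1$ and $\gamma_2$ as given in (4.22) and (4.23), and hence,

$$\begin{pmatrix} n^{k/(2k+1)} (\hat{\varphi}_n(x_0) - \varphi_0(x_0)) \\ n^{(k-1)/(2k+1)} (\hat{\varphi}_n'(x_0) - \varphi_0'(x_0)) \end{pmatrix} \to_d \begin{pmatrix} C_k(x_0, \varphi_0) H_k^{(2)}(0) \\ D_k(x_0, \varphi_0) H_k^{(3)}(0) \end{pmatrix}.$$

We get the explicit expression of the asymptotic constants $c_k(x_0, \varphi_0)$ and $d_k(x_0, \varphi_0)$ using the following relations:

$$f_0(x_0)^{-1} c_k(x_0, \varphi_0) = (\gamma_1 \gamma_2^2)^{-1} \quad \text{and} \tag{4.29}$$

$$f_0(x_0)^{-1} d_k(x_0, \varphi_0) = (\gamma_1 \gamma_2^3)^{-1}. \tag{4.30}$$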
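This is completely analogous to the derivations on page 1689 in Groeneboom, Jongbloed and Wellner (2001b), precisely

$$\big( \gamma_1 \hat{H}_n^{\mathrm{locmod}}(\gamma_2 t) \big)^{(2)}(0) = \gamma_1 \gamma_2^2 (\hat{H}_n^{\mathrm{locmod}})^{(2)}(0) = n^{k/(2k+1)} f_0(x_0) c_k(x_0, \varphi_0)^{-1} \big( \hat{\varphi}_n(x_0) - \varphi_0(x_0) \big) \tag{4.31}$$

and

$$\big( \gamma_1 \hat{H}_n^{\mathrm{locmod}}(\gamma_2 t) \big)^{(3)}(0) = \gamma_1 \gamma_2^3 (\hat{H}_n^{\mathrm{locmod}})^{(3)}(0) = n^{(k-1)/(2k+1)} f_0(x_0) d_k(x_0, \varphi_0)^{-1} \big( \hat{\varphi}_n'(x_0) - \varphi_0'(x_0) \big). \tag{4.32}$$

From (4.29) and (4.30), we get $c_k(x_0, \varphi_0)$ and $d_k(x_0, \varphi_0)$ as given in (2.2) and (2.3), and $C_k(x_0, \varphi_0)$ and $D_k(x_0, \varphi_0)$ as in (2.4) and (2.5).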
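The two scaling equations $\gamma_1 \gamma_2^{3/2} = a^{-1}$ and $\gamma_1 \gamma_2^{k+2} = \sigma^{-1}$ can be solved numerically as a cross-check on the constants. The following is a minimal sketch, not a definitive implementation: the function name `scaling_constants` is hypothetical, and the standard normal density at $x_0 = 0$ (where $k = 2$, $f_0(0) = 1/\sqrt{2\pi}$ and $\varphi_0''(0) = -1$) is chosen here purely for illustration.

```python
import math


def scaling_constants(f0_x0, phi_k_abs, k):
    """Solve gamma1 * gamma2**1.5 = 1/a and gamma1 * gamma2**(k+2) = 1/sigma.

    a = 1/sqrt(f0(x0)) and sigma = |phi0^{(k)}(x0)| / (k+2)!, as in (4.24).
    """
    a = 1.0 / math.sqrt(f0_x0)
    sigma = phi_k_abs / math.factorial(k + 2)
    # Dividing the two equations gives gamma2**(k + 1/2) = a / sigma.
    gamma2 = (a / sigma) ** (2.0 / (2 * k + 1))
    gamma1 = 1.0 / (a * gamma2 ** 1.5)
    return gamma1, gamma2


# Illustrative input (an assumption): standard normal density at x0 = 0.
k = 2
f0_x0 = 1.0 / math.sqrt(2.0 * math.pi)
phi_k_abs = 1.0
g1, g2 = scaling_constants(f0_x0, phi_k_abs, k)

sigma = phi_k_abs / math.factorial(k + 2)
second_equation = g1 * g2 ** (k + 2) * sigma   # should be close to 1
c_over_f0 = 1.0 / (g1 * g2 ** 2)               # right-hand side of (4.29)
d_over_f0 = 1.0 / (g1 * g2 ** 3)               # right-hand side of (4.30)
print(g1, g2, second_equation, c_over_f0, d_over_f0)
```

Solving the system in closed form gives $\gamma_2 = ([(k+2)!]^2 / (f_0(x_0) |\varphi_0^{(k)}(x_0)|^2))^{1/(2k+1)}$ and $\gamma_1 = (f_0(x_0)^{k+2} |\varphi_0^{(k)}(x_0)|^3 / [(k+2)!]^3)^{1/(2k+1)}$, which the numerical values reproduce.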
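(ii) Note that we can write

$$\mathbb{Y}_n^{\mathrm{loc}}(t) - \hat{H}_n^{\mathrm{loc}}(t) = r_n \big( \mathbb{H}_n(x_n(t)) - \hat{H}_n(x_n(t)) \big) \ge 0$$

by making use of (1.1) and the specific choice of $\hat{A}_n$ and $\hat{B}_n$. But, since we connect $\hat{H}_n^{\mathrm{locmod}}$ and $\mathbb{Y}_n^{\mathrm{locmod}}$ to the "invelope," the latter property needs primarily to hold for the modified processes. This can easily be established by considering (4.27) and (4.28), and hence it follows that

$$\mathbb{Y}_n^{\mathrm{locmod}}(t) - \hat{H}_n^{\mathrm{locmod}}(t) \ge 0$$

for all $t \in \mathbb{R}$, with equality if $x_n(t) = x_0 + t n^{-1/(2k+1)} \in \mathcal{S}_n(\hat{\varphi}_n)$.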

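(iii) We now show that $\hat{A}_n$ and $\hat{B}_n$ are tight. By Theorem 4.3, we know that there exist $M > 0$ and $\tau \in \mathcal{S}_n(\hat{\varphi}_n)$ such that $0 \le x_0 - \tau \le M n^{-1/(2k+1)}$ with large probability. Now, using (4.11), we can write

$$|\hat{A}_n| \le r_n s_n \big| (\hat{F}_n(x_0) - \hat{F}_n(\tau)) - (\mathbb{F}_n(x_0) - \mathbb{F}_n(\tau)) \big| + r_n s_n / n$$

$$\le r_n s_n \Big| \int_\tau^{x_0} \Big( \hat{f}_n(u) - \sum_{j=0}^{k-1} \frac{f_0^{(j)}(x_0)}{j!} (u - x_0)^j \Big) du \Big| + r_n s_n \Big| \int_\tau^{x_0} \Big[ \sum_{j=0}^{k-1} \frac{f_0^{(j)}(x_0)}{j!} (u - x_0)^j - f_0(u) \Big] du \Big| + r_n s_n \Big| \int_\tau^{x_0} d(\mathbb{F}_n - F_0) \Big| + n^{-k/(2k+1)}$$

$$=: A_{n1} + A_{n2} + A_{n3} + n^{-k/(2k+1)}.$$

Now,

$$A_{n1} \le r_n s_n \Big| \int_\tau^{x_0} \Big( e_n(u) - f_0(x_0) \big( \hat{\varphi}_n(u) - \varphi_0(x_0) - (u - x_0) \varphi_0'(x_0) \big) \Big) du \Big|$$

$$\quad + r_n s_n f_0(x_0) \frac{|\varphi_0'(x_0)|^k}{k!} \Big| \int_\tau^{x_0} (u - x_0)^k \, du \Big| + r_n s_n f_0(x_0) \Big| \int_\tau^{x_0} \big( \hat{\varphi}_n(u) - \varphi_0(x_0) - (u - x_0) \varphi_0'(x_0) \big) du \Big|$$

$$\le o_p(1) + O_p\big( r_n s_n |\tau - x_0|^{k+1} \big) + O_p\big( r_n s_n |\tau - x_0| n^{-k/(2k+1)} \big) = O_p(1),$$

where we used (4.18) and (4.17) to bound the first and last terms. To bound $A_{n2}$, we use Taylor approximation of $f_0(u)$ around $x_0$ to get

$$A_{n2} \le r_n s_n \frac{|f_0^{(k)}(x_0)|}{k!} \Big| \int_\tau^{x_0} (u - x_0)^k \, du \Big| + r_n s_n \Big| \int_\tau^{x_0} (u - x_0)^k \varepsilon_n(u) \, du \Big|,$$

where $\varepsilon_n$ is a function such that $\| \varepsilon_n \| \to_p 0$ as $x_0 - \tau \to_p 0$. To bound $A_{n3}$, similar derivations as the ones used for bounding $R_{2n}$ (see the proof of Lemma 4.4) can be employed, where the perturbation function $\Delta_1$ needs to be replaced by $\Delta_2(x) = 1_{[\tau, x_0]}(x)$.

At "one higher integration level," similar computations can be used to show tightness of $\hat{B}_n$.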

(iv) The proof of this last part of the theorem is basically identical to that 
of Theorem 6.2 for the LSE in Groeneboom, Jongbloed and Wellner (2001b) 
and arguments similar to those of Groeneboom, Jongbloed and Wellner (2001a) 
or, alternatively, tightness plus uniqueness arguments along the lines of 
Groeneboom, Maathuis, and Wellner (2008). □ 



Proof of Theorem 2.1. The claimed joint convergence involving $\hat{\varphi}_n$ and $\hat{\varphi}_n'$ follows from part (iv) of Theorem 4.6 and the relations (4.31) and (4.32). The joint limiting distribution of $\hat{f}_n(x_0) - f_0(x_0)$ and $\hat{f}_n'(x_0) - f_0'(x_0)$ follows immediately by applying the delta-method. □

4.3. Proofs for Section 3. 

Proof of Theorem 3.6. We first use the simple fact that $\hat{M}_n$ is the only point $x \in \mathbb{R}$ which satisfies

$$\hat{\varphi}_n'(t) \begin{cases} > 0, & \text{if } t < x, \\ \le 0, & \text{if } t > x. \end{cases} \tag{4.33}$$

This follows immediately from concavity of $\hat{\varphi}_n$ and the definition of $\hat{M}_n$. Note that $\hat{\varphi}_n$ may have a flat region or "modal interval"; in this case, there exists an entire interval of points where the maximum is attained, and $\hat{M}_n$ is the left endpoint of this interval.
A tightness property of the process $H_2^{(3)}$, which follows from Lemma 2.7 of Groeneboom, Jongbloed and Wellner (2001b), is also needed to establish the limiting distribution of $\hat{M}_n$: for any $\varepsilon > 0$ and $t \in \mathbb{R}$, there exists $C = C(\varepsilon)$ such that

$$P\big( |H_2^{(3)}(t) + 24 t| > C \big) \le \varepsilon.$$

In other words, one can view $H_2^{(3)}(t)$ as an "estimator" of the odd function $-24t$. Since $C$ is independent of $t$, it follows that, for a fixed $\varepsilon > 0$, $H_2^{(3)}(t) < 0$ [resp. $H_2^{(3)}(t) > 0$] for $t > 0$ (resp. $t < 0$) with $|t|$ big enough, with probability greater than $1 - \varepsilon$.

The sign of $H_2^{(3)}$ and uniqueness of $\hat{M}_n$ turn out to be crucial in determining the limiting distribution of the latter. From Theorem 4.6 and the two derivative relations, (4.31) and (4.32), it follows that

$$\begin{pmatrix} n^{k/(2k+1)} \big( \hat{\varphi}_n(x_0 + t n^{-1/(2k+1)}) - \varphi_0(x_0) - t n^{-1/(2k+1)} \varphi_0'(x_0) \big) \\ n^{(k-1)/(2k+1)} \big( \hat{\varphi}_n'(x_0 + t n^{-1/(2k+1)}) - \varphi_0'(x_0) \big) \end{pmatrix} \Rightarrow \begin{pmatrix} H_{k,a,\sigma}^{(2)}(t) \\ H_{k,a,\sigma}^{(3)}(t) \end{pmatrix} \qquad \text{in } C[-K, K] \times D[-K, K] \tag{4.34}$$

for each $K > 0$, with the product topology induced by the uniform topology on $C[-K, K]$ and the Skorohod topology on $D[-K, K]$. Here, $H_{k,a,\sigma}$ is the unique process on $\mathbb{R}$ satisfying (4.25). A similar result holds for the MLE $\hat{f}_n$ of the log-concave density $f_0$. When $x_0$ is replaced by the population mode $m_0 = M(f_0)$ and $k = 2$, the second weak convergence implies that

$$n^{1/5} \hat{\varphi}_n'(m_0 + T n^{-1/5}) \to_d H_{2,a,\sigma}^{(3)}(T)$$

and

$$n^{1/5} \hat{\varphi}_n'(m_0 - T n^{-1/5}) \to_d H_{2,a,\sigma}^{(3)}(-T).$$

For $T > 0$ large enough, this in turn implies that, for $\varepsilon > 0$, we can find $N \in \mathbb{N} \setminus \{0\}$ such that, for all $n > N$, we have

$$P\big( \hat{\varphi}_n'(m_0 - T n^{-1/5}) > 0 \text{ and } \hat{\varphi}_n'(m_0 + T n^{-1/5}) < 0 \big) > 1 - \varepsilon.$$

Using the property of $\hat{M}_n$ in (4.33), it follows that

$$P\big( \hat{M}_n \in [m_0 - T n^{-1/5}, m_0 + T n^{-1/5}] \big) > 1 - \varepsilon$$

for all $n > N$.

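We first conclude that $\hat{M}_n - m_0 = O_p(n^{-1/5})$. Then, we note that

$$n^{1/5} (\hat{M}_n - m_0) = M(Z_n),$$

where

$$Z_n(t) = n^{2/5} \big( \hat{\varphi}_n(m_0 + t n^{-1/5}) - \varphi_0(m_0) \big) \Rightarrow Z(t) := H_{2,a,\sigma}^{(2)}(t) \qquad \text{in } C([-K, K])$$

for each $K > 0$, by (4.34) with $k = 2$. Thus, by the argmax continuous mapping theorem [see, e.g., van der Vaart and Wellner (1996), page 286] it follows that

$$M(Z_n) \to_d M(Z) = M(H_{2,a,\sigma}^{(2)}),$$

where $Z = H_{2,a,\sigma}^{(2)}$, $a = 1/\sqrt{f_0(m_0)}$, and $\sigma = |\varphi_0^{(2)}(m_0)| / 4!$.

Note that $H_{2,a,\sigma}$ is related to the "driving process" $Y_{2,a,\sigma}$ with $a = 1/\sqrt{f_0(m_0)}$, $\sigma = |\varphi_0^{(2)}(m_0)| / 4!$ as in (4.24) with $k = 2$. Now, $\gamma_1 Y_{2,a,\sigma}(\gamma_2 t) \stackrel{d}{=} Y_2(t)$ as processes, where $Y_2 := Y_{2,1,1}$. Thus, it also holds that

$$\gamma_1 H_{2,a,\sigma}(\gamma_2 t) \stackrel{d}{=} H_2(t) \quad \text{and} \quad \gamma_1 \gamma_2^2 H_{2,a,\sigma}^{(2)}(\gamma_2 t) \stackrel{d}{=} H_2^{(2)}(t),$$

or, equivalently, $H_{2,a,\sigma}^{(2)}(t) \stackrel{d}{=} H_2^{(2)}(t/\gamma_2) / (\gamma_1 \gamma_2^2)$. Since $M(d \, g(c \, \cdot)) = c^{-1} M(g)$ for $c, d > 0$, it follows that

$$M(H_{2,a,\sigma}^{(2)}) \stackrel{d}{=} M\Big( \frac{1}{\gamma_1 \gamma_2^2} H_2^{(2)}\Big( \frac{\cdot}{\gamma_2} \Big) \Big) = \gamma_2 M(H_2^{(2)}),$$

where

$$\gamma_2 = \Big( \frac{(4!)^2}{f_0(m_0) \varphi_0''(m_0)^2} \Big)^{1/5} = \Big( \frac{(4!)^2 f_0(m_0)}{f_0''(m_0)^2} \Big)^{1/5}$$

by direct computation using $f_0'(m_0) = 0 = \varphi_0'(m_0)$ and Lemma 4.2. □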
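The two equivalent forms of the mode constant, one in terms of $\varphi_0''(m_0)$ and one in terms of $f_0''(m_0)$, can be compared numerically. The sketch below is illustrative only: the function names are hypothetical, and the standard normal density (mode $m_0 = 0$, $f_0(0) = 1/\sqrt{2\pi}$, $\varphi_0''(0) = -1$, $f_0''(0) = -1/\sqrt{2\pi}$) is an example chosen here, not one taken from the paper.

```python
import math


def mode_scale_from_phi(f0_m0, phi2_m0):
    """gamma_2 for k = 2, written with the log-density: ((4!)^2 / (f0 * phi0''^2))^(1/5)."""
    return (math.factorial(4) ** 2 / (f0_m0 * phi2_m0 ** 2)) ** 0.2


def mode_scale_from_f(f0_m0, f2_m0):
    """Same constant written with the density: ((4!)^2 * f0 / f0''^2)^(1/5)."""
    return (math.factorial(4) ** 2 * f0_m0 / f2_m0 ** 2) ** 0.2


# Illustrative input (an assumption): standard normal density at its mode.
f0_m0 = 1.0 / math.sqrt(2.0 * math.pi)
phi2_m0 = -1.0
f2_m0 = phi2_m0 * f0_m0   # Lemma 4.2 with k = 2 and phi0'(m0) = 0

scale_phi = mode_scale_from_phi(f0_m0, phi2_m0)
scale_f = mode_scale_from_f(f0_m0, f2_m0)
print(scale_phi, scale_f)
```

Both expressions agree because Lemma 4.2 with $k = 2$ and $\varphi_0'(m_0) = 0$ gives $f_0''(m_0) = \varphi_0''(m_0) f_0(m_0)$.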